The PDF is 62 pages in total.
For this PDF, I have created a dictionary like the following. Each key is the name of an output file, and each value is the number of pages to save under that name.
pdf_dic = {
    'tokyocaffe': 3,
    'yokohamabook': 10,
    'saitamahouse': 5,
    'tokyoshline': 19,
    'aichicoffee': 7,
    'Fukuokfood': 9,
    'tokyobook': 3,
    'kyotocaffe': 2,
    'shigafood': 3,
    'tokyogoods': 1,
}
I would like to create the PDFs in order, for example pages 1-3 as tokyocaffe.pdf and pages 4-13 as yokohamabook.pdf.
I can split out individual pages using PyPDF2, but I don't know how to drive the split from the contents of the dictionary.
If you know how to do this, please let me know.
You can extract a range of pages by passing a tuple of (start position, end position + 1) as the pages argument of PyPDF2.PdfMerger.append. (Start and end positions are 0-based.)
Combining this with the dictionary gives the desired code.
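To make the (start, end + 1) convention concrete, here is a minimal sketch that turns the page counts from the question's dictionary into 0-based ranges. It does not touch any PDF, and page_ranges is a hypothetical helper name chosen for illustration, not part of PyPDF2.

```python
from itertools import accumulate

# Page counts from the question (62 pages in total)
pdf_dic = {
    'tokyocaffe': 3,
    'yokohamabook': 10,
    'saitamahouse': 5,
    'tokyoshline': 19,
    'aichicoffee': 7,
    'Fukuokfood': 9,
    'tokyobook': 3,
    'kyotocaffe': 2,
    'shigafood': 3,
    'tokyogoods': 1,
}


def page_ranges(page_counts):
    """Return {name: (start, end)} with a 0-based start and an exclusive end."""
    ends = list(accumulate(page_counts.values()))  # running totals: 3, 13, 18, ...
    starts = [0] + ends[:-1]                       # each range starts where the previous one ended
    return {name: (s, e) for name, s, e in zip(page_counts, starts, ends)}


ranges = page_ranges(pdf_dic)
print(ranges['tokyocaffe'])    # (0, 3), i.e. pages 1-3
print(ranges['yokohamabook'])  # (3, 13), i.e. pages 4-13
```

Each tuple has the shape that the pages argument expects, so the first file covers pages 1-3 and the second covers pages 4-13, as requested in the question.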
Sample code
So that it runs as a single script, the code downloads the Digital Agency's open data guidelines (7 pages in total) if the file is not already present. (Requires pip install requests.)
import pathlib

# Download the open data guidelines if the file is not already present
file_name = "20220523_resources_data_guideline_01.pdf"
if not pathlib.Path(file_name).exists():
    import requests  # requires: pip install requests
    url = "https://www.digital.go.jp/assets/contents/node/basic_page/field_ref_resources/f7fde41d-ffca-4b2a-9b25-94b8a701a037/7c57e1a9/20220523_resources_data_guideline_01.pdf"
    res = requests.get(url)
    res.raise_for_status()  # stop here if the download failed
    with open(file_name, "wb") as f:
        f.write(res.content)

import PyPDF2

# Output name -> number of pages (3 + 2 + 2 = 7 pages in total)
pdf_dic = {
    'hoge': 3,
    'fuga': 2,
    'piyo': 2,
}

start = 0  # start position (0-based)
for key in pdf_dic:
    merger = PyPDF2.PdfMerger()
    end = start + pdf_dic[key]  # end position + 1
    merger.append(file_name, pages=(start, end))  # extract pages start to end - 1
    merger.write(f"{key}.pdf")  # save as <key>.pdf
    merger.close()
    start = end  # the next file starts where this one ended
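
Before running the split on a real file, it may be worth checking that the page counts in the dictionary add up to the length of the PDF, since a mismatch is easy to miss. The following is only a sketch: check_page_counts is a hypothetical helper, and the total is hard-coded so the example runs without a PDF.

```python
def check_page_counts(page_counts, total_pages):
    """Raise ValueError if the counts are invalid or do not add up to total_pages."""
    for name, count in page_counts.items():
        if count < 1:
            raise ValueError(f"{name}: page count must be at least 1, got {count}")
    planned = sum(page_counts.values())
    if planned != total_pages:
        raise ValueError(
            f"dictionary covers {planned} pages, but the PDF has {total_pages}"
        )


pdf_dic = {'hoge': 3, 'fuga': 2, 'piyo': 2}

# In a real script the total could come from len(PyPDF2.PdfReader(file_name).pages).
# It is hard-coded to 7 here (an assumption matching the sample PDF above).
check_page_counts(pdf_dic, 7)  # passes silently when the counts match
```

For the 62-page PDF in the question, the same check would be called with the question's dictionary and a total of 62.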