I have the following code to recast a set of medical records so that the new PDF has bookmarks that reflect the date of medical treatment. The code creates a new PDF, but the new PDF has no bookmarks. Thank you in advance for any help.
import re
import pypdf

# Open the PDF file for reading
pdf_file = open("C:/Users/stand/Downloads/SusanJonesVisitDate.pdf", 'rb')
pdf_reader = pypdf.PdfReader(pdf_file)
pdf_writer = pypdf.PdfWriter()

# Define the regular expression for finding the bookmark locations
regex = re.compile(r'Visit date: \b\d{2}/\d{2}/\d{4}\b')

# Iterate through the pages of the PDF
for i in range(len(pdf_reader.pages)):
    page = pdf_reader.pages[i]
    text = page.extract_text()
    matches = re.finditer(regex, text)
    pdf_writer.add_page(page)
    for match in matches:
        pdf_writer.add_outline_item( title=matches, page_number=page)

# Write the new PDF file with bookmarks
output_file = open('porejemplo.pdf', 'wb')
pdf_writer.write(output_file)
output_file.close()
pdf_file.close()
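
For anyone comparing notes, here is a minimal sketch of one way the loop above might be adjusted. It is not a confirmed fix. It assumes the likely problem is the call to add_outline_item: the title there is the whole matches iterator rather than the text of a single match, and the page is passed as a reader page object rather than as the integer page index. The helper names find_visit_dates and add_visit_bookmarks are hypothetical, introduced only for this illustration. The regular expression is the one from the post.

```python
import re

# Same pattern as in the post.
VISIT_DATE = re.compile(r"Visit date: \b\d{2}/\d{2}/\d{4}\b")


def find_visit_dates(page_texts):
    """Return (page_index, matched_text) pairs for every match on every page.

    Hypothetical helper: takes the extracted text of each page, in order.
    """
    bookmarks = []
    for index, text in enumerate(page_texts):
        # Treat a missing or empty extraction as "no matches on this page".
        for match in VISIT_DATE.finditer(text or ""):
            bookmarks.append((index, match.group()))
    return bookmarks


def add_visit_bookmarks(src_path, dst_path):
    """Copy src_path to dst_path, adding one outline item per matched date.

    Hypothetical helper. pypdf is imported here so that find_visit_dates
    can be tried on plain strings without pypdf installed.
    """
    import pypdf

    reader = pypdf.PdfReader(src_path)
    writer = pypdf.PdfWriter()

    texts = []
    for page in reader.pages:
        writer.add_page(page)
        texts.append(page.extract_text())

    for index, title in find_visit_dates(texts):
        # title is a plain string; page_number is the zero-based page index.
        writer.add_outline_item(title=title, page_number=index)

    with open(dst_path, "wb") as output_file:
        writer.write(output_file)


# Example call, using the paths from the post:
# add_visit_bookmarks("C:/Users/stand/Downloads/SusanJonesVisitDate.pdf", "porejemplo.pdf")
```

If the output still has no bookmarks after a change like this, it may be worth printing the result of page.extract_text() for a page that should match. The extracted text does not always contain the phrase exactly as it appears on screen (scanned pages, for example, may yield no text at all), and in that case the regular expression never matches and no outline items are added.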