Python: Replace Annotated Hyperlinks in PDF files

Product: PDFTron

Product Version:

Hi,

I’m using the PDFNetPython3 Python library to modify some annotated hyperlinks in PDF files. I want to replace all the hyperlinks.

# Only the PDFNet names used below are imported; the unused os, argparse and PyPDF2 imports are dropped.
from PDFNetPython3 import PDFDoc, PDFNet, SDFDoc, Annot, Link, Action, Rect

def AnnotationHighLevelAPI(doc):
    # The following code snippet traverses all annotations in the document
    print("Traversing all annotations in the document...")
    page_num = 1
    itr = doc.GetPageIterator()
    
    while itr.HasNext():
        print("Page " + str(page_num) + ": ")
        page_num = page_num + 1
        page = itr.Current()
        num_annots = page.GetNumAnnots()
        i = 0
        while i < num_annots:
            annot = page.GetAnnot(i)
            if not annot.IsValid():
                i = i + 1  # advance before skipping, or the loop never terminates
                continue
            print("Annot Type: " + annot.GetSDFObj().Get("Subtype").Value().GetName())
            
            bbox = annot.GetRect()
            formatter = '{0:g}'
            print("  Position: " + formatter.format(bbox.x1) + 
                  ", " + formatter.format(bbox.y1) +
                  ", " + formatter.format(bbox.x2) + 
                  ", " + formatter.format(bbox.y2))
            
            annot_type = annot.GetType()  # avoid shadowing the built-in type()

            if annot_type == Annot.e_Link:
                link = Link(annot)
                action = link.GetAction()
                print(link)
                if not action.IsValid():
                    i = i + 1  # advance before skipping, or the loop never terminates
                    continue
                    
                if action.GetType() == Action.e_URI:
                    uri = action.GetSDFObj().Get("URI").Value().GetAsPDFText()
                    print("  Links to: " + str(uri))
                    
                    # Replace a hyperlink...
                    hyperlink = Link.Create(doc.GetSDFDoc(), Rect(bbox.x1, bbox.y1, bbox.x2, bbox.y2), Action.CreateURI(doc.GetSDFDoc(), "http://www.pdftron.com"))
                    page.AnnotPushBack(hyperlink)
        
            i = i + 1
        itr.Next()

PDFNet.Initialize("")
doc = PDFDoc("annotated.pdf")
doc.InitSecurityHandler()

AnnotationHighLevelAPI(doc)
doc.Save("annotated_1.result.pdf", SDFDoc.e_linearized)
doc.Close()

And here is the output:

Annot Type: Stamp
  Position: 240.8, 647, 260, 679
Annot Type: Link
  Position: 404, 418, 524, 448
<PDFNetPython.Link; proxy of <Swig Object of type 'pdftron::PDF::Annots::Link *' at 0x7f5de0245cc0> >
  Links to: https://google.com/folder/1/A2
Annot Type: FreeText
  Position: 404, 418, 524, 448
Annot Type: Link
  Position: 212, 404, 332, 434
<PDFNetPython.Link; proxy of <Swig Object of type 'pdftron::PDF::Annots::Link *' at 0x7f5de018c8a0> >
  Links to: https://google.com/folder/1/A3
Annot Type: FreeText
  Position: 212, 404, 332, 434

Now that I have the links behind the annotations, I want to replace google.com with my own domain.com.

I’m trying to replace the links here by pushing new annotations back onto the page, but it’s not working.

Typically, a hyperlink in a PDF is an invisible Link annotation, and the text you see is part of the page content itself.

So to change the URL of the link, start with our Annotation sample and edit the URI key of the link's action.
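As a rough sketch of what editing the URI key could look like: the code below walks the Link annotations the same way the Annotation sample does and overwrites the URI entry of each URI action in place. The names `swap_host` and `rewrite_link_uris` are made up for illustration, and the use of `PutString` on the action's SDF object is an assumption based on how other PDFNet samples set dictionary strings; it has not been run against the file from this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_host(uri, old_host, new_host):
    """Return uri with old_host replaced by new_host; other URIs pass through unchanged."""
    parts = urlsplit(uri)
    # Only an exact host match is rewritten (a host with a port or user info is left alone).
    if parts.netloc.lower() != old_host.lower():
        return uri
    return urlunsplit(parts._replace(netloc=new_host))


def rewrite_link_uris(in_path, out_path, old_host, new_host, license_key=""):
    """Edit the URI key of every URI link action in place, then save a copy."""
    # Imported here so swap_host stays usable without the SDK installed.
    from PDFNetPython3 import PDFNet, PDFDoc, SDFDoc, Annot, Link, Action

    PDFNet.Initialize(license_key)
    doc = PDFDoc(in_path)
    doc.InitSecurityHandler()

    itr = doc.GetPageIterator()
    while itr.HasNext():
        page = itr.Current()
        for i in range(page.GetNumAnnots()):
            annot = page.GetAnnot(i)
            if not annot.IsValid() or annot.GetType() != Annot.e_Link:
                continue
            action = Link(annot).GetAction()
            if not action.IsValid() or action.GetType() != Action.e_URI:
                continue
            action_obj = action.GetSDFObj()
            old_uri = action_obj.Get("URI").Value().GetAsPDFText()
            new_uri = swap_host(old_uri, old_host, new_host)
            if new_uri != old_uri:
                # Assumption: overwrite the existing key rather than adding a second Link annotation.
                action_obj.PutString("URI", new_uri)
        itr.Next()

    doc.Save(out_path, SDFDoc.e_linearized)
    doc.Close()
```

A call such as `rewrite_link_uris("annotated.pdf", "annotated_1.result.pdf", "google.com", "domain.com")` would then leave every annotation where it is and change only the link targets.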

To change the visible text, yes, you would use the ContentReplacer class.
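For the visible text, a minimal sketch along the lines of the ContentReplacer sample might look like the following. The helper name `replace_text_in_rect` is hypothetical, and it assumes the text to replace lies inside the rectangle you pass in (for example the link's own rectangle), which may not hold for every document.

```python
def replace_text_in_rect(page, rect, new_text):
    """Replace whatever text lies inside rect on the given page with new_text."""
    # Imported lazily so this module loads even where the SDK is not installed.
    from PDFNetPython3 import ContentReplacer

    replacer = ContentReplacer()
    replacer.AddText(rect, new_text)  # queue a region-based text replacement
    replacer.Process(page)            # apply the queued replacement to this page
```

Inside the annotation loop this could be called as `replace_text_in_rect(page, annot.GetRect(), "domain.com")`, keeping in mind that the annotation rectangle and the text bounds do not always line up exactly.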

If the above does not help, then please provide an example PDF file, and clearly indicate which hyperlinks you want to change.

Hi @Ryan, please check the updated question. I’m able to get the links behind the annotations, but when I try to insert the updated link back at the same position, it does not work.

Ideally you can provide an example file that I can review.

If you cannot do that, then please provide your code and a screenshot showing what you got, and clearly indicate what you expected to get.