Q:
How can I programmatically highlight words on a PDF page based on a
search query?
We would like to be able to highlight words that were used for
searching.
----
A:
One option is to use the TextExtractor class to search for words on a
given PDF page. Each extracted word exposes its bounding box as well
as its text style and other properties. You can use the bounding box
to programmatically highlight the word in the PDFView class by calling
its SelectByRect() method. The following pseudocode is written in
VB.NET, but the same approach works in any other language supported
by PDFNet (e.g. C#, Java, C/C++).
' n is the number of the page to search.
Dim doc As PDFDoc = _pdfview.GetPDFDoc()
Dim itr As PageIterator = doc.PageFind(n)
Dim pg As Page = itr.Current()
Dim txt As TextExtractor = New TextExtractor()
txt.Begin(pg)
Dim word As TextExtractor.Word
Dim line As TextExtractor.Line = txt.GetFirstLine()
' searchQuery is a placeholder for the user's search term.
Dim WordToHighLite As String = searchQuery.ToUpper()
Dim HighLiteWord As Boolean = False
While line.IsValid()
    word = line.GetFirstWord()
    While word.IsValid()
        Debug.Print(word.GetString())
        ' Case-insensitive substring match against the search term.
        If InStr(word.GetString().ToUpper(), WordToHighLite) > 0 Then
            HighLiteWord = True
            Exit While
        End If
        ' GetNextWord() returns the next word, so keep the result.
        word = word.GetNextWord()
    End While
    If HighLiteWord Then Exit While
    ' GetNextLine() returns the next line, so keep the result.
    line = line.GetNextLine()
End While
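If HighLiteWord Then
    Dim bbox As Rect = word.GetBBox()
    ' The bounding box is in page coordinates. This assumes that
    ' SelectByRect() expects screen coordinates, so the corners are
    ' converted from page points to screen points first.
    Dim x1 As Double = bbox.x1, y1 As Double = bbox.y1
    Dim x2 As Double = bbox.x2, y2 As Double = bbox.y2
    _pdfview.ConvPagePtToScreenPt(x1, y1, n)
    _pdfview.ConvPagePtToScreenPt(x2, y2, n)
    _pdfview.SelectByRect(x1, y1, x2, y2)
End If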
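The listing above stops at the first match. If the query contains
several words, or the same word occurs more than once on the page, the
matching step could be separated from the viewer code. The following
is a minimal sketch in Java of that matching step only. QueryMatcher,
Word and findMatches are hypothetical names used for illustration and
are not part of PDFNet; Word stands in for the text and bounding box
you would read from each extracted word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class QueryMatcher {

    /** Hypothetical stand-in for an extracted word: text plus bounding box in page coordinates. */
    public static final class Word {
        public final String text;
        public final double x1, y1, x2, y2;

        public Word(String text, double x1, double y1, double x2, double y2) {
            this.text = text;
            this.x1 = x1;
            this.y1 = y1;
            this.x2 = x2;
            this.y2 = y2;
        }
    }

    /**
     * Returns, in page order, every word whose text contains at least one
     * whitespace-separated term of the query. Matching ignores case.
     */
    public static List<Word> findMatches(List<Word> words, String query) {
        // Split the query into upper-cased terms, skipping empty pieces.
        List<String> terms = new ArrayList<>();
        for (String term : query.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                terms.add(term.toUpperCase(Locale.ROOT));
            }
        }

        List<Word> matches = new ArrayList<>();
        for (Word word : words) {
            String upper = word.text.toUpperCase(Locale.ROOT);
            for (String term : terms) {
                if (upper.contains(term)) {
                    matches.add(word);
                    break; // add each word at most once
                }
            }
        }
        return matches;
    }
}
```

The bounding box of each returned word could then be handled in the
same way as bbox in the listing above.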
If you would like to highlight the words permanently, so that the
highlight becomes part of the PDF document itself, please search the
Knowledge Base for the following article:
"How to search and highlight text using PDFNet"
http://groups.google.com/group/pdfnet-sdk