Q: I would like to search and highlight text in a PDF based on user input. For
example:
1) do a word/phrase search through an entire pdf. For instance, I
want to know on which pages the phrase "game theory" appears in a
given PDF. I want to show those pages on which the phrase appears on
a web page.
2) once I find the pages that have the phrase "game theory" the user
will choose a page (let's say page 5) on which the phrase appears. We
want to now highlight all instances of "game theory" on that page in
yellow, and then:
3) turn that highlighted pdf page into an image and show the image on
the web page
Please let me know if I can use PDFNet (http://www.pdftron.com/net)
for this task.
----
A: PDFNet SDK can be used to implement search and highlight on PDF
documents.
For text search you could use the 'pdftron.PDF.TextExtractor' class as
shown in the TextExtract sample project
(http://www.pdftron.com/net/samplecode.html#TextExtract). Besides
finding specific words, TextExtractor returns positioning information
(bounding box), style, and other properties for each word/character.
This information can be used to highlight text.
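As a rough illustration of how the per-word text and bounding boxes
could drive both the page search and the highlighting, here is a
minimal, library-free sketch in Java. It does not call PDFNet: the
Word, HighlightBox and PhraseSearch classes are hypothetical stand-ins,
and the word lists are assumed to have been filled beforehand from
whatever the text extractor returns for each page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for one extracted word and its bounding box.
// In a real project these values would come from the text extractor.
class Word {
    final String text;
    final double x1, y1, x2, y2;

    Word(String text, double x1, double y1, double x2, double y2) {
        this.text = text;
        this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
    }
}

// Hypothetical rectangle covering one phrase match on a page.
class HighlightBox {
    final double x1, y1, x2, y2;

    HighlightBox(double x1, double y1, double x2, double y2) {
        this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
    }
}

class PhraseSearch {
    // Lower-case a token and trim punctuation at both ends.
    static String normalize(String token) {
        return token.toLowerCase(Locale.ROOT)
                    .replaceAll("^\\p{Punct}+|\\p{Punct}+$", "");
    }

    // Return one box per occurrence of the phrase in the page's word list.
    static List<HighlightBox> findPhrase(List<Word> words, String phrase) {
        List<HighlightBox> hits = new ArrayList<>();
        String trimmed = phrase.trim();
        if (trimmed.isEmpty()) {
            return hits;
        }
        String[] terms = trimmed.split("\\s+");
        for (int k = 0; k < terms.length; k++) {
            terms[k] = normalize(terms[k]);
        }
        for (int i = 0; i + terms.length <= words.size(); i++) {
            boolean match = true;
            for (int j = 0; j < terms.length && match; j++) {
                match = normalize(words.get(i + j).text).equals(terms[j]);
            }
            if (!match) {
                continue;
            }
            // Union of the matched words' boxes.
            double x1 = Double.POSITIVE_INFINITY, y1 = Double.POSITIVE_INFINITY;
            double x2 = Double.NEGATIVE_INFINITY, y2 = Double.NEGATIVE_INFINITY;
            for (int j = 0; j < terms.length; j++) {
                Word w = words.get(i + j);
                x1 = Math.min(x1, w.x1); y1 = Math.min(y1, w.y1);
                x2 = Math.max(x2, w.x2); y2 = Math.max(y2, w.y2);
            }
            hits.add(new HighlightBox(x1, y1, x2, y2));
        }
        return hits;
    }

    // Return the 1-based numbers of the pages on which the phrase appears.
    static List<Integer> pagesContaining(List<List<Word>> pages, String phrase) {
        List<Integer> result = new ArrayList<>();
        for (int p = 0; p < pages.size(); p++) {
            if (!findPhrase(pages.get(p), phrase).isEmpty()) {
                result.add(p + 1);
            }
        }
        return result;
    }
}
```

pagesContaining would answer step 1 (which pages to list), and
findPhrase on the chosen page would give the rectangles to paint yellow
in step 2. Note that the sketch takes the union of the word boxes, so a
phrase that wraps onto a second line would yield one oversized box; a
fuller version would probably emit one box per line.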
To convert (highlighted) PDF pages to JPEG (or other image formats)
you could use the PDFDraw class as shown in the PDFDraw sample
(http://www.pdftron.com/net/samplecode.html#PDFDraw).
We have many clients who have implemented this type of solution using
PDFNet. As a starting point you may want to use the sample code
provided in the following articles:
C#: http://groups.google.com/group/pdfnet-sdk/browse_thread/thread/4625f9567d1b34be/09b32d05716996fd
JAVA: http://groups.google.com/group/pdfnet-sdk/browse_thread/thread/d078562409fdced1/40fb134228fb0c48
http://groups.google.com/group/pdfnet-sdk/browse_thread/thread/384aa4ccf6f91103/ed3799a06e3edb16
If you are looking for a stand-alone PDF viewer component for
interactive search and highlighting, you may want to take a look at
the pdftron.PDF.PDFView class (see http://www.pdftron.com/net/samplecode.html#PDFView).
The PDFView class has built-in text search and highlighting.