Implementing a PDF semantic engine

Q: I'm working on a semantic engine and I'd like to introduce a
PDF document manager able to recognize the tables in a PDF document.

Is the PDFNet library currently able to extract this information?
Otherwise, is it possible to extract spatial information about words,
images and graphical shapes?

A: You can use the PDFNet SDK to extract any
information from PDF documents (including text, images, positioning
information, graphics state attributes, etc.).
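
To illustrate how positioning information can feed table recognition,
here is a minimal sketch in plain Python. It does not call PDFNet: it
assumes you have already extracted each word with its bounding box and
copied it into the hypothetical Word record below. The names Word,
group_into_rows and split_into_cells, and the two tolerance values, are
assumptions made for this example and are not part of any SDK. The
heuristic (same baseline means same row, wide horizontal gap means new
cell) is only one simple approach and will need tuning for real pages.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """Hypothetical record: a word and its bounding box in PDF points.

    PDF user space has its origin at the bottom-left, so a larger y
    value means the word sits higher on the page.
    """
    text: str
    x1: float
    y1: float
    x2: float
    y2: float


def group_into_rows(words: List[Word], y_tol: float = 3.0) -> List[List[Word]]:
    """Cluster words into rows by comparing vertical centers.

    A word joins the current row when its center is within y_tol of the
    center of the first word placed in that row.
    """
    rows: List[List[Word]] = []
    # Visit words from the top of the page to the bottom.
    for w in sorted(words, key=lambda w: -(w.y1 + w.y2) / 2):
        cy = (w.y1 + w.y2) / 2
        if rows:
            ref = rows[-1][0]
            if abs(cy - (ref.y1 + ref.y2) / 2) <= y_tol:
                rows[-1].append(w)
                continue
        rows.append([w])
    # Within each row, order words left to right.
    for row in rows:
        row.sort(key=lambda w: w.x1)
    return rows


def split_into_cells(row: List[Word], x_gap: float = 10.0) -> List[str]:
    """Merge neighbouring words into one cell unless the gap exceeds x_gap."""
    cells: List[str] = []
    prev_x2 = None
    for w in row:
        if prev_x2 is not None and w.x1 - prev_x2 <= x_gap:
            cells[-1] += " " + w.text
        else:
            cells.append(w.text)
        prev_x2 = w.x2
    return cells
```

Calling group_into_rows on the words of a page and then
split_into_cells on each row yields a list of candidate table rows. A
real engine would usually also check that cell boundaries line up
across consecutive rows, and could use ruling lines from the page's
graphical shapes as additional evidence.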

As a starting point for your project you may want to take a look at
the following samples:


For PDF documents that contain logical structure (i.e., explicit
semantic information), you can also use the high-level logical structure
API: see LogicalStructure.