Preserving font when editing text data?

Aaron_Gravesdale · October 17, 2007, 11:31pm

Q:

Is it possible to preserve the font type when we use the reader
functions to read in a text element, then use the writer functions to
write this information to a new page or a new document?

if (element.GetType() == Element.Type.e_text) {
    char[] text_char = element.GetTextString();
}
---

A:

The problem is that element.GetTextString() returns a Unicode-mapped
string, but the PDF font may or may not use a Unicode encoding. The
font may also be subsetted, meaning that only the subset of glyphs
used on the page is present in the font. The text as it is actually
encoded in the page content stream can be obtained using
element.GetTextData(). Unfortunately, this data is often not very
useful, because the associated font can use any of a number of
different encodings. Usually the best approach is to create a new font
(possibly based on the existing font: extract it with
font.GetEmbeddedFont(), then recreate the font) and then write the
Unicode text using the new font.
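
To illustrate why the raw text data is hard to reuse, here is a toy
model in plain C#. It does not use PDFNet at all; the SubsetCodes table
is a made-up stand-in for the private encoding of a subsetted font, and
the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical subset font: character codes were handed out in
    // order of first use, so they have no relation to ASCII or Unicode.
    public static readonly Dictionary<byte, char> SubsetCodes =
        new Dictionary<byte, char>
        {
            { 1, 'H' }, { 2, 'e' }, { 3, 'l' }, { 4, 'o' }
        };

    // Decode raw content-stream bytes using a font-specific table.
    // Codes that are missing from the table become '?'.
    public static string Decode(byte[] data, Dictionary<byte, char> map)
    {
        var sb = new StringBuilder();
        foreach (byte b in data)
        {
            char c;
            sb.Append(map.TryGetValue(b, out c) ? c : '?');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // What a call like GetTextData() might hand back for "Hello".
        byte[] raw = { 1, 2, 3, 3, 4 };

        // With the right table the text is recoverable.
        Console.WriteLine(Decode(raw, SubsetCodes)); // Hello

        // With a different font's table the same bytes mean nothing.
        var otherFont = new Dictionary<byte, char> { { 72, 'H' } };
        Console.WriteLine(Decode(raw, otherFont)); // ?????
    }
}
```

The same bytes are only meaningful together with the font they were
written for, which is why writing the Unicode string with a font you
control is usually the simpler route.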

Aaron_Gravesdale · October 18, 2007, 4:55pm

Q:

I could not find a way to determine what the font is when we read in
the string. Is there a way to copy the page and then replicate it
section by section from the copy?
Basically, we are looking to read in a page, take a section of the
page, and replicate it on a newly created page, all the while
preserving the font.
---

A:

Using PDFNet you can copy parts of a page to another page. As a
starting point for your project you may want to take a look at the
ElementEdit (www.pdftron.com/net/samplecode.html#ElementEdit) and
EditText (www.pdftron.com/net/samplecode.html#EditText) sample
projects. Essentially, you can copy element by element (with all
associated fonts and other attributes) from one page to another.

The font associated with the text element can be accessed through the
graphics state. For example, element.GetGState().GetFont().
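
A minimal sketch of the element-by-element copy described above,
loosely following the pattern of the ElementEdit sample. It assumes the
PDFNet .NET assembly is referenced; the file names and the ShouldCopy
helper are hypothetical, and exact signatures may differ between PDFNet
versions, so treat it as an outline rather than tested code.

```csharp
using System;
using pdftron;
using pdftron.PDF;
using pdftron.SDF;

class CopyElementsSketch
{
    // Hypothetical filter: decide which elements belong to the section
    // you want to replicate. Here everything is copied.
    static bool ShouldCopy(Element element)
    {
        return true;
    }

    static void Main()
    {
        // Newer PDFNet releases expect a license key argument here.
        PDFNet.Initialize();

        // "input.pdf" and "output.pdf" are placeholder file names.
        using (PDFDoc doc = new PDFDoc("input.pdf"))
        using (ElementReader reader = new ElementReader())
        using (ElementWriter writer = new ElementWriter())
        {
            doc.InitSecurityHandler();

            Page src = doc.GetPage(1);
            Page dst = doc.PageCreate(src.GetMediaBox());

            reader.Begin(src);
            writer.Begin(dst);

            Element element;
            while ((element = reader.Next()) != null)
            {
                if (element.GetType() == Element.Type.e_text)
                {
                    // The font travels with the graphics state.
                    Font font = element.GetGState().GetFont();
                    Console.WriteLine("Text font: " + font.GetName());
                }

                if (ShouldCopy(element))
                {
                    // Writes the element together with its fonts and
                    // other graphics state attributes.
                    writer.WriteElement(element);
                }
            }

            writer.End();
            reader.End();

            doc.PagePushBack(dst);
            doc.Save("output.pdf", SDFDoc.SaveOptions.e_remove_unused);
        }
    }
}
```

If ShouldCopy is narrowed to skip some elements, keep the text-begin
and text-end elements that surround any text runs you copy, so the text
objects on the new page stay balanced.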