How do I embed fonts for existing PDFs with missing fonts? How do I replace a font in an existing PDF?

Q: How do I embed fonts for existing PDFs with missing fonts? How do I
replace a font in an existing PDF?

We are trying to substitute the fonts using the following code.

   PDFNet.Initialize();
   PDFNet.AddFontSubst("Times-Roman", "C:/WINNT/Fonts/times.ttf");
   ...
   PDFDoc sourceDoc = new PDFDoc(sourceFile);
   Page page = null;
   for( ... each page ...) {
     Obj res = page.GetResourceDict();
     if (res != null) {
       Obj fonts = res.FindObj("Font");
       if (fonts != null) {
         DictIterator itr = fonts.DictBegin();
         DictIterator end = fonts.DictEnd();
         for (; itr != end; itr.Next()) {
           Obj fnt_dict = itr.Value();
           pdftron.PDF.Font font = new pdftron.PDF.Font(fnt_dict);
           if (font.IsEmbedded()) continue;
         }
       }
     }
   }
   sourceDoc.Save(destFile, Doc.SaveOptions.e_linearized);
   sourceDoc.Close();
-----
A: This is possible, but it is not as simple as it may seem at first
glance.

The PDFNet.AddFontSubst function overrides the default font
substitution that occurs when PDFNet processes PDF documents with
missing fonts. This is useful when you would like PDFNet to use a
specific font instead of the one it would select on its own.
AddFontSubst neither embeds the given font nor modifies the PDF.

In order to embed a new font in a new or existing PDF, you could call
one of the Font.Create...() methods. For example:

pdftron.PDF.Font new_font =
pdftron.PDF.Font.CreateTrueTypeFont(pdfdoc.GetSDFDoc(), "myfont.ttf",
true, false);

To replace the existing font with the new font you could use the
SDFDoc.Swap() method. For example:

pdfdoc.GetSDFDoc().Swap(new_font.GetSDFObj().GetObjNum(),
old_font.GetSDFObj().GetObjNum());

In some cases this will work, but it will fail if the two fonts have
different encodings. In PDF, text can be represented using many
different encodings (including custom encodings). This information is
stored in the 'Encoding' entry of the font dictionary (for more info,
please see Section 5.5.5 'Character Encoding' in the PDF Reference
Manual - http://www.pdftron.com/downloads/PDFReference16.pdf). As a
result, simple font swapping works only if the font encodings match;
otherwise the modified PDF document may contain junk text. To resolve
this issue you would need to create a new font encoding that maps the
old character codes to WinAnsiEncoding/Unicode and update the Encoding
entry in the font dictionary (e.g.
new_font.GetSDFObj().PutName("Encoding", "MacRomanEncoding") or
new_font.GetSDFObj().Put("Encoding", encoding_dict)). The Font method
'MapToUnicode' can be useful when creating the new Encoding
dictionary. The pseudocode may look along the following lines:

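pdftron.SDF.Obj RemapFontEncoding(pdftron.PDF.Font old_font)
{
  Obj encoding_dict = pdfdoc.CreateIndirectDict();
  Obj diffs = encoding_dict.PutArray("Differences");
  for (int i=0; i<256; ++i) {
    // Pseudocode: Unicode value of char code i in the old font.
    Unicode uni = old_font.MapToUnicode(i);
    String glyph_name = GetGlyphname(uni);
    // A Differences entry is a char code followed by its glyph name,
    // so the char code (i), not the Unicode value, is written here.
    diffs.PushBackNumber(i);
    diffs.PushBackName(glyph_name);
  }
  return encoding_dict;
}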

String GetGlyphname(int uni) {
   // given a Unicode code-point search the AGL (Adobe Glyph List)
   // http://www.adobe.com/devnet/opentype/archives/glyph.html
   // and return the corresponding glyph name.
}
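
The GetGlyphname stub above could be filled in roughly as follows. This
is only a sketch: the class name GlyphNames is hypothetical, the table
is a tiny hand-picked excerpt of the Adobe Glyph List (a real
implementation would load the full list), and the fallback assumes the
"uniXXXX" / "uXXXXX" naming form from Adobe's glyph naming convention.
It uses no PDFNet calls.

```csharp
using System;
using System.Collections.Generic;

public static class GlyphNames
{
    // Hand-picked excerpt of the Adobe Glyph List (assumption: a real
    // implementation would load the complete list from a file).
    private static readonly Dictionary<int, string> Agl =
        new Dictionary<int, string>
        {
            { 0x0020, "space" },
            { 0x002E, "period" },
            { 0x0041, "A" },
            { 0x0061, "a" },
            { 0x00E9, "eacute" },
            { 0x20AC, "Euro" },
        };

    // Returns a glyph name for the given Unicode code point.
    public static string GetGlyphName(int uni)
    {
        if (uni < 0 || uni > 0x10FFFF)
            throw new ArgumentOutOfRangeException(nameof(uni));

        string name;
        if (Agl.TryGetValue(uni, out name))
            return name;

        // Not in the excerpt: fall back to "uni" + 4 hex digits for
        // BMP code points, or "u" + hex digits above U+FFFF.
        return uni <= 0xFFFF
            ? "uni" + uni.ToString("X4")
            : "u" + uni.ToString("X");
    }
}
```

For example, GlyphNames.GetGlyphName(0x20AC) returns "Euro", while a
code point missing from the excerpt such as 0x0416 returns "uni0416".
Whether the new font actually contains a glyph under the returned name
still has to be checked separately.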

...

pdftron.PDF.Font new_font =
  pdftron.PDF.Font.CreateTrueTypeFont(pdfdoc.GetSDFDoc(), "myfont.ttf",
  true, false);
Obj new_font_obj = new_font.GetSDFObj();

// Modify font Encoding
new_font_obj.Put("Encoding", RemapFontEncoding(old_font));

// Copy widths array...
Obj widths = old_font.GetSDFObj().FindObj("Widths");
if (widths != null) {
   new_font_obj.Put("Widths", widths);
}

// Swap the two fonts
pdfdoc.GetSDFDoc().Swap(new_font.GetSDFObj().GetObjNum(),
old_font.GetSDFObj().GetObjNum());

pdfdoc.Save(...)
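
As a side note on the Differences array built in RemapFontEncoding
above: that loop writes a number before every glyph name, which is
valid, but PDF also allows one char code followed by the names for
consecutive codes. The sketch below formats a code-to-name map in that
compact form as plain text, just to show the layout. The class and
method names are hypothetical and no PDFNet calls are involved.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class DifferencesArray
{
    // Formats a char-code -> glyph-name map as the text of a compact
    // Differences array: a code is written only when it does not
    // directly follow the previous code.
    public static string Format(SortedDictionary<int, string> codeToName)
    {
        var sb = new StringBuilder("[");
        int expected = -1; // code that would continue the current run
        foreach (KeyValuePair<int, string> entry in codeToName)
        {
            if (entry.Key != expected)
                sb.Append(' ').Append(entry.Key);
            sb.Append(" /").Append(entry.Value);
            expected = entry.Key + 1;
        }
        return sb.Append(" ]").ToString();
    }
}
```

For a map holding 65 -> A, 66 -> B, 67 -> C and 97 -> a, Format returns
"[ 65 /A /B /C 97 /a ]". The same run detection could decide when to
call PushBackNumber in the loop above.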

Q: Thanks for the reply. We tried font embedding using the given code.
The font embedding works, but afterwards the file size increases
drastically.

Please let us know if we are missing anything in the font embedding.


A: This is expected. PDFs that contain embedded fonts are larger
than PDFs that are missing fonts. One way to decrease the size of an
embedded font is to subset it (the last flag in the
Font.CreateTrueTypeFont() method). When the font subsetting flag is
enabled, PDFNet will start tracking glyphs that are referenced from
PDF pages. When the document is saved, PDFNet will then remove all
unused glyphs. As a result, the decrease in file size will be
proportional to the number of unused/unreferenced glyphs.
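
To get a rough feel for how much subsetting might save, one could count
the distinct characters a document uses and compare that with the
number of glyphs in the font. The sketch below is a simplification
under stated assumptions: it treats one Unicode code point as one
glyph (ignoring ligatures and composite glyphs), it expects well-formed
UTF-16 strings, and the class and method names are hypothetical. It
does not use PDFNet.

```csharp
using System;
using System.Collections.Generic;

public static class SubsetEstimate
{
    // Collects the distinct Unicode code points used in the given
    // strings (assumes well-formed UTF-16 input).
    public static SortedSet<int> UsedCodePoints(IEnumerable<string> texts)
    {
        var used = new SortedSet<int>();
        foreach (string text in texts)
        {
            for (int i = 0; i < text.Length; i++)
            {
                int cp = char.ConvertToUtf32(text, i);
                if (char.IsHighSurrogate(text[i]))
                    i++; // skip the low half of a surrogate pair
                used.Add(cp);
            }
        }
        return used;
    }

    // Fraction of the font's glyphs that subsetting could drop,
    // assuming one glyph per used code point.
    public static double UnusedFraction(int totalGlyphs, int usedGlyphs)
    {
        if (totalGlyphs <= 0)
            throw new ArgumentOutOfRangeException(nameof(totalGlyphs));
        if (usedGlyphs < 0 || usedGlyphs > totalGlyphs)
            throw new ArgumentOutOfRangeException(nameof(usedGlyphs));
        return (double)(totalGlyphs - usedGlyphs) / totalGlyphs;
    }
}
```

For instance, if the page text uses 250 distinct characters from a font
with 1000 glyphs, UnusedFraction(1000, 250) gives 0.75. The actual
saving in bytes depends on the size of each glyph outline, so this is
only an indication.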

One thing to keep in mind is that if you simply swap an existing font
with a new font, PDFNet will not know which glyphs were previously
referenced from the document. As a result PDFNet will remove all
glyphs from the font and the PDF will not display properly. In order
to apply font subsetting when swapping or replacing existing PDF fonts,
you also need to find the subset of the font used in the document by
rewriting the page content. The process of page rewriting will ‘notify’
the newly embedded font which glyphs are used in the file.
The function may look as follows:

void FindFontSubset(PDFDoc doc)
{
  ElementWriter writer = new ElementWriter();
  ElementReader reader = new ElementReader();
  Element element;
  PageIterator itr = doc.GetPageIterator();
  for (; itr.HasNext(); itr.Next()) { // Read every page
    Page page = itr.Current();
    reader.Begin(page);
    Page tmp_page = doc.PageCreate();
    writer.Begin(tmp_page);
    while ((element = reader.Next()) != null) {
      writer.WriteElement(element);
    }
    writer.End();
    reader.End();
  }
}

Please note that all new pages and page content will be discarded
because the temporary pages are never added to the document (using
doc.PagePushBack() etc.). FindFontSubset(doc) can be called just before
calling pdfdoc.Save(…).

Another way to avoid embedding fonts is to replace the missing fonts
with standard PDF fonts (a.k.a. the base 14 fonts). These fonts can be
created using
pdftron.PDF.Font.Create(pdfdoc.GetSDFDoc(),pdftron.PDF.Font.StandardType1Font…).
According to the PDF specification, PDF consumers are required to
support these standard fonts, even if the corresponding font is not
installed on the system.
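
Before deciding whether a missing font can be handled this way, it may
help to check whether its BaseFont name is one of the 14 standard
names. The sketch below is a plain lookup with no PDFNet calls; the
class and method names are hypothetical, and it assumes the name is
passed without any subset prefix or alias (e.g. "Arial" is not mapped
to Helvetica).

```csharp
using System;
using System.Collections.Generic;

public static class Standard14
{
    // The 14 standard Type 1 font names defined by the PDF specification.
    private static readonly HashSet<string> Names =
        new HashSet<string>(StringComparer.Ordinal)
        {
            "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
            "Helvetica", "Helvetica-Bold", "Helvetica-Oblique",
            "Helvetica-BoldOblique",
            "Courier", "Courier-Bold", "Courier-Oblique",
            "Courier-BoldOblique",
            "Symbol", "ZapfDingbats",
        };

    // True if the given BaseFont name is exactly one of the standard names.
    public static bool IsStandard14(string baseFontName)
    {
        return baseFontName != null && Names.Contains(baseFontName);
    }
}
```

With this check, Standard14.IsStandard14("Times-Roman") is true, so the
font from the original question would not need to be embedded at all,
while a name such as "Arial" would still require embedding or an
explicit replacement.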

Q: When I performed “Save As” in Adobe, this file was reduced to 61 KB.
So optimizing the PDF after font embedding should solve the size
issue.


A: It really depends on the number of referenced glyphs in the
document. If all glyphs are referenced then there would be no decrease
in file size. If most glyphs are not referenced (as is the case with
some Unicode fonts) then there would be a more significant decrease in
file size. PDFNet includes a built-in font subsetting function (i.e.
the last flag in the Font.Create...() methods), so it is possible to
process existing PDFs or generate new documents so that the file size
is comparable to the one produced by Acrobat Pro Optimizer.

I also came across another KB article that may be relevant
(please search for "Adding font data to PDF documents with missing
fonts" in this forum). With this approach the font data is simply
added to the existing PDF font dictionary and you don't need to create
a new font, swap objects, etc. Unfortunately, in this case PDFNet will
not subset the font, although you could use PDFNet to find all
referenced glyphs and perform the subsetting before embedding the
missing font data.