How to remove Tagged content from a PDF?


We are having issues with a third-party tool that fails to handle Tagged PDF files. How can we remove the Tagged data so that the other tool works properly?


There are three ways to remove the Tagged data, each involving a larger modification of the PDF than the one before.

  1. Set the MarkInfo/Marked boolean to false, or simply delete the MarkInfo entry (as we do here, since Marked defaults to false anyway).

Obj root = pdfDoc.GetRoot();
root.Erase("MarkInfo");

  2. Do (1) above, and then also delete the Logical Structure tree (the StructTreeRoot entry).

Obj root = pdfDoc.GetRoot();
root.Erase("MarkInfo");
root.Erase("StructTreeRoot");

  3. Do (1) and (2) above, and then use the ElementEdit sample code to read each page's content and NOT write back elements of the following types.

e_marked_content_begin
e_marked_content_point
e_marked_content_end
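
The filtering idea in the third option can be illustrated without the SDK. The following is a minimal Python sketch, not the ElementEdit sample itself: it assumes a page's content has already been parsed into (operator, operands) pairs, which is a hypothetical representation chosen for this example. In the PDF content stream, the three element types above should correspond to the marked-content operators BMC/BDC (begin), MP/DP (point), and EMC (end). The names `strip_marked_content` and `page_content` are made up for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: one content-stream instruction as
# (operator, operands). A real SDK exposes these as typed elements instead.
Instruction = Tuple[str, tuple]

# Marked-content operators defined by the PDF specification:
# BMC/BDC open a marked-content sequence, EMC closes it,
# MP/DP designate a single marked-content point.
MARKED_CONTENT_OPERATORS = frozenset({"BMC", "BDC", "EMC", "MP", "DP"})


def strip_marked_content(instructions: Iterable[Instruction]) -> List[Instruction]:
    """Copy every instruction except the marked-content ones."""
    return [ins for ins in instructions if ins[0] not in MARKED_CONTENT_OPERATORS]


# A toy page: one tagged paragraph followed by a marked-content point.
page_content = [
    ("BDC", ("/P", "<</MCID 0>>")),
    ("BT", ()),
    ("Tf", ("/F1", 12)),
    ("Tj", ("(Hello)",)),
    ("ET", ()),
    ("EMC", ()),
    ("MP", ("/Note",)),
]

cleaned = strip_marked_content(page_content)
print([op for op, _ in cleaned])  # ['BT', 'Tf', 'Tj', 'ET']
```

Only the marked-content wrappers are dropped; the text-drawing instructions between them are written back unchanged, so the visible page content should stay the same.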

Full code available upon request.