Thanks for the response.
I did some investigating; here are the file sizes:
PDF - 6.5 MB
FDF - 1.0 MB
XFDF - 3.8 MB
XFDF without ‘line’ annot - 0.03 MB
FDF looks better, but the file is still too big.
So I decided to write the XFDF manually (or use my own format):
for (PageIterator itr = doc.getPageIterator(); itr.hasNext(); ) {
    Page page = (Page) (itr.next());
    int num_annots = page.getNumAnnots();
    for (int i = 0; i < num_annots; ++i) {
        Annot annot = page.getAnnot(i);
        if (!annot.isValid()) {
            continue;
        }
        Obj sdf = annot.getSDFObj();
        String subtype = sdf.get("Subtype").value().getName();
        if (!subtype.equals("Link")) {
            // TODO convert Annot to xfdf
        }
    }
}
Is there a utility class that can help me convert an Annot to XFDF format like this?
<highlight color="#FFFF00" opacity="1" creationdate="D:20160212092604Z00'00'" flags="print" date="D:20160212092604Z00'00'" page="0" coords="256.035004,524.870002,374.325004,524.870002,256.035004,498.530002,374.325004,498.530002" rect="256.035004,498.530002,374.325004,524.870002" title="">
</highlight>
I couldn’t find documentation that describes how to extract the required fields from the Annot class for each annotation type (I only need highlight, strikeout, underline, ink, and text).
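To make the "write it manually" idea concrete, here is a minimal sketch of the string-building half only. It assumes the page index, rect, quad coordinates, colour and opacity have already been read from the annotation somehow; it makes no PDFNet calls at all. The class and method names (XfdfHighlightSketch, highlightElement, and so on) are hypothetical, and the creationdate, date and flags attributes are left out for brevity.

```java
import java.math.BigDecimal;
import java.util.Locale;

// Hypothetical helper: builds an XFDF <highlight> element from plain values.
// How those values are obtained from an Annot is deliberately not shown.
public class XfdfHighlightSketch {

    // Joins numbers with commas, six decimals each, always using '.' as the decimal separator.
    static String formatNumbers(double... values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(String.format(Locale.ROOT, "%.6f", values[i]));
        }
        return sb.toString();
    }

    // Converts 0-255 RGB components to the #RRGGBB form used in the example above.
    static String toHexColor(int r, int g, int b) {
        return String.format(Locale.ROOT, "#%02X%02X%02X", r, g, b);
    }

    // Escapes the characters that are not allowed inside a double-quoted XML attribute.
    static String escapeAttr(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // rect = {x1, y1, x2, y2}; quadCoords = 8 numbers per highlighted quad.
    static String highlightElement(int page, double[] rect, double[] quadCoords,
                                   int r, int g, int b, double opacity, String title) {
        // Print 1.0 as "1" and 0.5 as "0.5", matching the sample element.
        String opacityText = BigDecimal.valueOf(opacity).stripTrailingZeros().toPlainString();
        return "<highlight color=\"" + toHexColor(r, g, b) + "\""
                + " opacity=\"" + opacityText + "\""
                + " page=\"" + page + "\""
                + " coords=\"" + formatNumbers(quadCoords) + "\""
                + " rect=\"" + formatNumbers(rect) + "\""
                + " title=\"" + escapeAttr(title) + "\">\n"
                + "</highlight>";
    }

    public static void main(String[] args) {
        double[] rect = {256.035004, 498.530002, 374.325004, 524.870002};
        double[] quad = {256.035004, 524.870002, 374.325004, 524.870002,
                         256.035004, 498.530002, 374.325004, 498.530002};
        System.out.println(highlightElement(0, rect, quad, 255, 255, 0, 1.0, ""));
    }
}
```

Strikeout and underline elements could presumably reuse the same builder with a different tag name, since they carry the same kind of quad coordinates; ink and text annotations need different child data and would need their own builders.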
On Friday, 12 February 2016 at 03:26:57 UTC+2, Ryan wrote:
This XFDF data means that each page has an annotation that, when clicked, takes the user to another page, in this case from page 3 to page 17 (XFDF uses zero-based page numbering).
You could preprocess the PDF and remove the link annotations.
There would need to be a lot of annotations for a 32-bit process to run out of memory, but it is possible.
Is it possible for you to keep the annotations in FDF format (which is binary and much smaller)?