[PDFNet] Question concerning merge/conversion to PDF/A-1b

Hello,

We have been using the PDFNet SDK (Java) for a while and have run into some strange behaviour.
We merge files and convert the result to PDF/A-1b.
The merge is done page by page, using the following code snippet:

---snip---
PDFDoc doc = new PDFDoc("test/resources/pdf/regression/d-1273046479615.pdf");
PageIterator it = doc.getPageIterator();
for (int i = 1; it.hasNext(); ++i) {
    Page page = (Page) it.next();

    // append the page to some doc...
}

---snap---

The source document that we want to merge is roughly 11 MB in size. If we save a single page of this document on its own, the result is 10 MB. This happens for every page. In our business process we convert the merged PDF files into PDF/A-1b compliant files. After merging all pages, the resulting document has a size of 350 MB. Unfortunately, I could not attach it to this mail due to its size.

Usually, the conversion takes only a matter of seconds. However, this particular document takes minutes to convert (if it succeeds at all). I have attached a single split page of the document and the conversion messages.

Do you have any idea what might cause this explosion in file size?
And is there any way to avoid it (perhaps by configuration)?

Thanks and best regards,
Mathias Peters
--
You received this message because you are subscribed to the "PDFTron PDFNet SDK" group. To post to this group, send email to support@pdftron.com
To unsubscribe from this group, send email to pdfnet-sdk-unsubscribe@googlegroups.com. For more information, please visit us at http://www.pdftron.com

The problem seems to be that you are using PagePushBack() directly to
copy pages into the destination PDF. Because PDF pages may share
fonts, images, and other resources, copying them one at a time can
duplicate those resources and inflate the file size. To work around
this, import all pages into the destination document in one go, as
shown in the last code sample (sample 6) in the PDF page sample project.
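
To illustrate why the two approaches differ in size, here is a minimal toy model in plain Java. It does not use PDFNet at all: the classes Resource, Page and Doc, the method names, and all sizes are hypothetical and only mimic the idea that a copy operation remembers which shared objects it has already copied. Copying page by page starts with an empty memory for each page, so a shared font or image is copied again every time. Importing all pages in one call shares one memory, so each resource is copied once. (In PDFNet itself the batch call should be PDFDoc.importPages in the Java binding, but please verify that against the sample project for your SDK version.)

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class SharedResourceCopyDemo {

    /** A font or image object (hypothetical model, not a PDFNet class). */
    public static final class Resource {
        final int sizeKb;
        Resource(int sizeKb) { this.sizeKb = sizeKb; }
    }

    /** A page that references resources stored in its document. */
    public static final class Page {
        final List<Resource> resources = new ArrayList<>();
    }

    /** A document; its size is the sum of the resource objects it stores. */
    public static final class Doc {
        final List<Page> pages = new ArrayList<>();
        final List<Resource> stored = new ArrayList<>();

        public int sizeKb() {
            int total = 0;
            for (Resource r : stored) {
                total += r.sizeKb;
            }
            return total;
        }
    }

    /** Copies one page; 'copied' remembers resources already present in dest. */
    static void copyPage(Page src, Doc dest, Map<Resource, Resource> copied) {
        Page out = new Page();
        for (Resource r : src.resources) {
            Resource c = copied.get(r);
            if (c == null) {
                c = new Resource(r.sizeKb);
                copied.put(r, c);
                dest.stored.add(c);
            }
            out.resources.add(c);
        }
        dest.pages.add(out);
    }

    /** One copy operation per page: every call starts with an empty memory. */
    public static Doc copyPageByPage(Doc src) {
        Doc dest = new Doc();
        for (Page p : src.pages) {
            copyPage(p, dest, new IdentityHashMap<>());
        }
        return dest;
    }

    /** One copy operation for all pages: shared resources are copied once. */
    public static Doc importAllAtOnce(Doc src) {
        Doc dest = new Doc();
        Map<Resource, Resource> copied = new IdentityHashMap<>();
        for (Page p : src.pages) {
            copyPage(p, dest, copied);
        }
        return dest;
    }

    /** Builds a document whose pages all share one large resource. */
    public static Doc sampleDoc(int pageCount, int sharedKb, int perPageKb) {
        Doc doc = new Doc();
        Resource shared = new Resource(sharedKb);
        doc.stored.add(shared);
        for (int i = 0; i < pageCount; i++) {
            Resource own = new Resource(perPageKb);
            doc.stored.add(own);
            Page page = new Page();
            page.resources.add(shared);
            page.resources.add(own);
            doc.pages.add(page);
        }
        return doc;
    }

    public static void main(String[] args) {
        Doc src = sampleDoc(3, 10_000, 100);
        System.out.println("source:       " + src.sizeKb() + " KB");
        System.out.println("page by page: " + copyPageByPage(src).sizeKb() + " KB");
        System.out.println("all at once:  " + importAllAtOnce(src).sizeKb() + " KB");
    }
}
```

In this model the page-by-page result grows with the number of pages times the size of the shared resource, while the one-call import stays at the size of the source. That matches the pattern described in the question, where each single page is almost as large as the whole source document.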
