How to optimise large PDF files?

Laurent_Biancardini · February 4, 2020, 9:46am

In WebViewer 6.0.4, I’m trying to optimise a large PDF file using two methods, but I can’t get either of them to work. What am I doing wrong?

1 - Linearization method

I use the linearization method on the client side as follows (link: https://www.pdftron.com/documentation/web/guides/get-file-data-with-viewer/), but the output isn’t a linearized document.

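```
var doc = viewerInstance.docViewer.getDocument();

// Wait for the underlying PDFDoc to be ready before saving.
doc.getPDFDoc().then(function() {
  // Export the current annotations as XFDF so they can be merged into the saved file.
  annotManager.exportAnnotations().then(xfdf => {
    var options = {
      xfdfString: xfdf, // getFileData reads the XFDF from the xfdfString key
      flags: viewerInstance.CoreControls.SaveOptions.LINEARIZED,
      downloadType: 'pdf',
      flatten: true
    };

    doc.getFileData(options).then(function(docBuf) {
      var arr = new Uint8Array(docBuf);
      var blob = new Blob([arr], { type: 'application/pdf' });
      const url = URL.createObjectURL(blob);
      // Load the saved bytes back into the viewer.
      viewerInstance.loadDocument(url);
    });
  });
});
```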

2 - Optimize viewer method

I tried to use the optimize viewer method (link: https://www.pdftron.com/documentation/web/guides/viewer-optimized-pdf?searchTerm=optimize) but can’t make it work. If I store the file on web storage (Azure, Firebase, Amazon …), what should I put in outputPath?

```
var doc = viewerInstance.docViewer.getDocument();
doc.getPDFDoc().then(function(pdfDoc) {
  var options = {
    thumbnailSize: 1024,
    thumbnailRenderingThreshold: 40
  };
  pdfDoc.saveViewerOptimized('/output/path', options);
});
```

Thanks in advance for your help.

Matt_Parizeau · February 10, 2020, 7:47pm

Hi,

1 - Are you having trouble getting the output PDF, or are you able to get the PDF but it doesn’t appear to be linearized? I tried with a ~150MB PDF using the code you posted and it appears to be working for me: the output PDF was linearized. How did you verify that the output wasn’t linearized? Would you be able to share an example PDF for which the output is not linearized?

2 - I took a look into this and it appears that saveViewerOptimized is not intended to be exposed yet on the Web JavaScript side. We will update the code and API docs so this isn’t misleading. You can currently only use this API on the server side.

Matt Parizeau
Software Developer
PDFTron Systems Inc.

Laurent_Biancardini · February 11, 2020, 8:49am

Hi Mathew,

Thanks for your answer.

1 - I’m getting an output PDF but it’s not linearized. I tested it by displaying the output PDF in WebViewer with “viewerInstance.loadDocument(url);” and then downloading the PDF file to my computer. When I open the file in Adobe Acrobat DC, it shows that the file is not Fast Web View compatible. I also used the PDFNet.isLinearized method after loading the output PDF, and the value is false. How do you test your file?

2 - OK, it’s clear.
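
As a quick cross-check that does not depend on Acrobat, the saved bytes can be inspected directly. The sketch below is only a heuristic and makes one assumption about the file format: a linearized PDF keeps its linearization parameter dictionary, which contains the `/Linearized` key, within the first 1024 bytes of the file. The helper name `looksLinearized` is hypothetical and not part of WebViewer or PDFNet, and the check does not validate hint tables or the recorded file length.

```javascript
// Heuristic: report whether the first 1024 bytes of a PDF contain a
// /Linearized key. Accepts an ArrayBuffer or a Uint8Array, for example
// the buffer returned by getFileData.
function looksLinearized(buffer) {
  const head = new Uint8Array(buffer).subarray(0, 1024);

  // Decode one byte per character so arbitrary binary data cannot throw.
  let text = '';
  for (let i = 0; i < head.length; i++) {
    text += String.fromCharCode(head[i]);
  }

  // Require a PDF header and the /Linearized key as a whole name.
  return text.indexOf('%PDF-') !== -1 && /\/Linearized\b/.test(text);
}
```

Running this on the buffer passed to the `Blob` constructor, and again on the file that ends up on disk, may help show at which step the linearization is lost.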

Thanks,

Matt_Parizeau · February 13, 2020, 1:51am

Hi,

When you download the PDF, are you pressing the download button in the UI? That button calls getFileData, but it doesn’t use the linearized option when saving.

To verify, I ran the code that you originally posted, with getFileData and the SaveOptions.LINEARIZED flag, then opened the blob URL in a new tab and downloaded the file from Chrome’s default PDF viewer.
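
The verification described above might be sketched roughly as follows. The helper name `getLinearizedBlob` is hypothetical. It assumes `doc` is the object returned by `docViewer.getDocument()` and that the caller passes in `CoreControls.SaveOptions.LINEARIZED` as `linearizedFlag`, so the function itself has no dependency on the viewer.

```javascript
// Save the document with the linearized flag and wrap the bytes in a Blob.
// `doc` must expose getFileData(options) returning a Promise of an ArrayBuffer.
async function getLinearizedBlob(doc, linearizedFlag) {
  const data = await doc.getFileData({ flags: linearizedFlag });
  return new Blob([new Uint8Array(data)], { type: 'application/pdf' });
}

// Browser-only usage (assumes a WebViewer instance named viewerInstance):
//
//   const doc = viewerInstance.docViewer.getDocument();
//   const flag = viewerInstance.CoreControls.SaveOptions.LINEARIZED;
//   getLinearizedBlob(doc, flag).then(function (blob) {
//     // Open the result in a new tab and download it from there,
//     // rather than using the download button in the viewer UI.
//     window.open(URL.createObjectURL(blob));
//   });
```

Downloading from the new tab keeps the bytes produced by this save, whereas the UI download button triggers a separate save without the linearized option.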

Matt Parizeau
Software Developer
PDFTron Systems Inc.