MS Word to PDF Conversion in Java

We want to convert .doc and .docx format into PDF programmatically in Java. The Direct MS Office Conversion Add-On says "convert MS Word to PDF and other formats on any platform without using Microsoft Office", but in the code sample (https://www.pdftron.com/pdfnet/samplecode/ConvertTest.java.html) it says "requires MS Word and PDFNet printer installed". So, which one is correct?

Thanks,

Hi, thank you for pointing this out. We will look to improve this in a future release.

There is a specific WordToPDFTest sample project for direct conversions.
https://www.pdftron.com/pdfnet/samplecode/WordToPDFTest.java.html

Our direct conversion of docx to PDF provides more control over the process than our “indirect” conversion, and so an alternate API is provided.

Alternatively, you can modify the convert sample to the following, which will use our direct docx to pdf conversion.

// DISABLE the printer check; it is not required for docx conversion using the builtin converter.
//if( Convert.requiresPrinter( input_path + "simple-word_2007.docx") && printerInstalled ) {

// ENABLE the builtin converter
ConvertPrinter.setMode(ConvertPrinter.e_convert_printer_prefer_builtin_converter);

// the rest is the same
System.out.println("Converting MS Word document to PDF");
System.out.println("Using the PDFTron PDFNet printer");
PDFDoc doc = new PDFDoc();
Convert.toPdf(doc, input_path + "simple-word_2007.docx");
outputFile = output_path + "docx2pdf_Java.pdf";
doc.save(outputFile, SDFDoc.e_linearized, null);
System.out.println("Result saved in " + outputFile);
doc.close();

// closing brace of the disabled if statement above
//}

Thanks Ryan. Is the WordToPdfTest sample just for docx format? For me it fails on .doc files but the other approach works for both formats.

What is the difference between ‘direct’ and ‘indirect’ conversions? Does ‘direct’ conversion require MS Office installed?

Yes, currently the only Office format we support for direct conversion is docx. Other formats, such as Excel, PowerPoint and .doc, are in progress.

For general conversion, use Convert.toPdf, as it will handle as many types as possible, using the best option available.

To get notified about the next release, you can join our announcement Google group
https://groups.google.com/forum/#!forum/pdfnet-sdk-announce

Or register for our RSS feed
https://www.pdftron.com/pdfnet/whatsnew.html
We are also on Twitter
https://twitter.com/pdftron

Great question!

By direct we mean everything is done by PDFNet, with no dependency on anything else.

Indirect Office conversion requires a third-party application that can print Office documents: ideally MS Office, but LibreOffice is a possibility. Technically, on Windows, you can convert any printable file format (CAD drawings, for example) to PDF using PDFNet and an application that can print that file format to a printer.

Direct conversion, on the other hand, is done entirely in PDFNet and therefore works on any platform. PDFNet supports many direct conversions, including image formats.

So with our new docx direct conversion, you can use PDFNet to view docx files on Android, for instance, or run docx conversions on cheaper Linux servers.

Our SDK Features page gives more details
https://www.pdftron.com/pdfnet/addons.html#Direct

Thanks for the clarification. So, currently there is no ‘direct’ conversion for the .doc format? I tried the following on a system with no MS Office or PDFNet Printer installed, and it failed with an “Unable to find printer” error:

ConvertPrinter.setMode(ConvertPrinter.e_convert_printer_prefer_builtin_converter);

PDFDoc pdfdoc = new PDFDoc();

// perform the conversion with no optional parameters
Convert.toPdf(pdfdoc, INPUT_FOLDER + docName);

// save the result
pdfdoc.save(OUTPUT_FOLDER + docName + ".pdf", SDFDoc.e_linearized, null);

Here is how indirect conversion works (for Office documents, it is triggered if the appropriate Office application is not installed). PDFNet sends a print command to the OS to print the target file to PDFNet’s virtual printer. If there is a third-party application registered with the OS to print that file type, then it prints to the virtual printer. PDFNet then converts this output to PDF.

The links earlier in this thread point to forum posts showing how to add a program as the default for printing.

Using this combination, any printable format can be converted to PDF (or XOD for the Universal WebViewer) on Windows.
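To make the failure above easier to reason about, here is a minimal sketch of the decision described in this thread: docx goes through the direct converter, while other Office formats such as .doc fall back to the indirect route and therefore need the virtual printer. It is plain Java and does not call PDFNet; the class name, the Outcome enum, and the check method are hypothetical names made up for illustration. It also ignores the second requirement of the indirect route, namely an application registered with the OS to print that file type. In real code, the Convert.requiresPrinter call from the sample earlier in this thread is the one to ask.

```java
import java.util.Locale;

public class ConversionPreflight {

    /** Hypothetical outcome labels for this sketch; not part of the PDFNet API. */
    enum Outcome { DIRECT, INDIRECT_VIA_PRINTER, NEEDS_PRINTER }

    /** Returns the lower-cased file extension without the dot, or "" if there is none. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        int sep = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        if (dot <= sep || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /**
     * Assumption taken from this thread: docx is the only Office format with
     * direct conversion. Everything else is treated as needing the printer route.
     */
    static Outcome check(String fileName, boolean printerInstalled) {
        if ("docx".equals(extensionOf(fileName))) {
            return Outcome.DIRECT;
        }
        return printerInstalled ? Outcome.INDIRECT_VIA_PRINTER : Outcome.NEEDS_PRINTER;
    }

    public static void main(String[] args) {
        // Simulates a machine with no PDFNet printer installed.
        for (String name : new String[] {"report.docx", "legacy.doc"}) {
            System.out.println(name + " -> " + check(name, false));
        }
    }
}
```

Under these assumptions, a .doc file on a machine without the printer lands in NEEDS_PRINTER, which matches the “Unable to find printer” error reported above, while a .docx file on the same machine can still be converted.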

Hey Ryan

I am developing an Android app that also requires conversion of .docx files, text files, etc. I am currently using the demo version of the SDK. Can it perform such conversions?

If yes, how do I get the ConvertPrinter class?

There are multiple ways to convert MS Office to PDF using PDFNet.

Assuming that you do not want to convert via the virtual printer driver or Office interop, you can use the direct converter as shown in the Word2Pdf sample:
https://www.pdftron.com/pdfnet/samplecode.html#Word2Pdf

Please let me know if this helps.