Native to XOD Conversion Questions

Hello,

I had a few questions related to Native to XOD conversion.

  1. I’ve read that an XOD is just an XPS file that has been optimized for online viewing. So, is XOD a format specific to PDFTron? Is there any official documentation on what makes an XOD different from an XPS?
  2. For PDF conversion, I’ve read that for Office 2007 SP2 or higher documents, PDFNet will try to use Office interop, and if Office is not installed, it will fall back to the virtual printer. Is this the same for XOD as well? And are there other formats that will attempt to use the native application via Interop if it is installed?
  3. Am I correct in assuming that for some formats (for example, Lotus Notes E-mail files) I need to have the native application installed on the machine with PDFNet in order for it to properly convert?
  4. Is the “XPS Print Path” used for XOD conversion for all formats, whenever possible? Is that just part of what is sent by the virtual print driver when it sends the “printto” command?

Thanks!

  • Keith Kaminski

what makes an XOD different from an XPS?

XOD is essentially a type of XPS (XML Paper Specification - http://en.wikipedia.org/wiki/Open_XML_Paper_Specification). The spec was originally developed by Microsoft but (similar to PDF) it was handed over to an international standards body (ECMA; download the spec here: http://www.ecma-international.org/publications/standards/Ecma-388.htm). Open XML, which is the foundation of XPS, is also currently an ISO standard.

These days you can create XPS files from any Windows program (via Print to XPS). In some ways XOD is just a subset of XPS. You can rename a XOD file to XPS and view it locally on any Windows machine (or use the .NET Framework to view or manipulate it in your app - http://msdn.microsoft.com/en-us/library/ms748388(v=vs.110).aspx).

To guarantee fast viewing across mobile devices, XOD does not allow some XPS features (e.g. TIFF, HDPhoto, file interleaving, etc.). At the same time, XOD organizes the file so that it can be viewed efficiently without the need to download the entire file (this is similar to PDF linearization: http://blog.pdftron.com/2013/08/24/streaming-a-pdf-from-the-web/).

XOD also includes an optional Unicode representation of all text in the document (which could be used for client and server side document indexing, search, and in the near future reflow …).
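Since a XOD can be renamed to XPS, and XPS packages are ZIP archives, one rough way to see what is inside either file is to list its parts. The sketch below does that and flags parts whose extension suggests one of the image types mentioned above. The function names and the extension list are assumptions made for illustration; they do not come from the XOD or XPS documentation.

```python
import zipfile
from pathlib import PurePosixPath

# Assumed extensions for the image types the reply says XOD disallows
# (TIFF and HD Photo). Illustrative only, not taken from a spec.
DISALLOWED_EXTENSIONS = {".tif", ".tiff", ".wdp", ".hdp"}


def list_parts(package):
    """Return the part names stored in an XPS/XOD package (a ZIP archive)."""
    with zipfile.ZipFile(package) as archive:
        return archive.namelist()


def find_disallowed_parts(package):
    """Return parts whose extension suggests an image type XOD avoids."""
    return [
        name
        for name in list_parts(package)
        if PurePosixPath(name).suffix.lower() in DISALLOWED_EXTENSIONS
    ]
```

Calling find_disallowed_parts("sample.xps") on a file printed via Print to XPS would show whether it contains images that a XOD conversion would presumably need to re-encode.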

  2. For PDF conversion, I’ve read that for Office 2007 SP2 or higher documents, PDFNet will try to use Office interop, and if Office is not installed, it will fall back to the virtual printer. Is this the same for XOD as well?

Yes, everything that applies to PDF also applies to XOD. PDFNet can directly normalize PDF and XPS to XOD. So anything that can convert to PDF/XPS can be used to power your WebViewer solution.

PDFNet supports a few direct PDF/XPS/XOD converters (e.g. HTML to PDF, EMF, image, etc.). To add viewing support for a document format that is not directly supported (say DWG), you would need to find a program that can either convert the file to PDF or XPS (a quick search on the web shows many affordable utilities) or silently ‘print’ the file (e.g. most DWG viewers can do that). In the latter case you would associate the file extension (e.g. .dwg) with the print utility, and PDFNet would capture the print output as PDF/XPS/XOD.

In the case of MS Office, PDFNet will use COM interop (if MS Office is available) to produce high-quality PDF/XPS/XOD output. For Office files, the currently recommended approach is MS Office interop (rather than the Virtual Print Driver route). For more info please see: http://goo.gl/3b4iw4.
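The order of preference described above (direct converter, then Office interop, then the virtual printer) can be sketched as a small decision function. This is only an illustration of the reasoning, not PDFNet code: choose_route, the Route names, and both extension sets are hypothetical, and the real list of directly supported formats should be taken from the PDFNet documentation.

```python
from enum import Enum
from pathlib import Path


class Route(Enum):
    DIRECT = "direct converter"
    OFFICE_INTEROP = "MS Office interop"
    VIRTUAL_PRINTER = "virtual print driver"
    UNSUPPORTED = "needs an external tool"


# Hypothetical, illustrative extension sets (not an official list).
DIRECT_EXTENSIONS = {".pdf", ".xps", ".emf", ".html", ".png", ".jpg"}
OFFICE_EXTENSIONS = {".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx"}


def choose_route(filename, office_installed, printable_extensions):
    """Pick a conversion route for a file, following the order in the reply.

    printable_extensions is the set of extensions for which some installed
    application can silently print the file.
    """
    ext = Path(filename).suffix.lower()
    if ext in DIRECT_EXTENSIONS:
        return Route.DIRECT
    if ext in OFFICE_EXTENSIONS and office_installed:
        return Route.OFFICE_INTEROP
    if ext in printable_extensions:
        return Route.VIRTUAL_PRINTER
    return Route.UNSUPPORTED
```

For example, choose_route("plan.dwg", False, {".dwg"}) selects the virtual printer, while the same call with an empty set reports that an external tool is needed.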

  3. Am I correct in assuming that for some formats (for example, Lotus Notes E-mail files) I need to have the native application installed on the machine with PDFNet in order for it to properly convert?

WebViewer (and PDFNet) support a few direct conversions; however, it is technically not feasible to support every conceivable format (and other solutions that attempted to support hundreds of built-in conversions always failed). Instead you can extend the range of supported formats in your own solution by integrating with the apps/tools/CLI utilities/SDKs that can export to PDF/XPS or that can silently print the file. So if you want to support Lotus Notes E-mail files, you would need to find a tool that can either convert these to PDF/XPS or print them.

  4. Is the “XPS Print Path” used for XOD conversion for all formats, …

Not sure if I understand the question. The Virtual Print Driver uses the “XPS Print Path”, which basically captures print commands as an XPS file. XOD conversion just simplifies and linearizes the XPS so that it can be viewed efficiently on the client side.

Thanks for the replies. This was very helpful. I just have a couple of follow-up questions:

  1. If we assume for a moment that I have a vanilla installation of Windows Server 2008 R2 that is converting documents to XOD with PDFNet, what formats would be supported without the need for any additional software? Is there a full list I can find somewhere?
  2. Assume I have some made-up file format with the extension .KLK. If I wanted the Virtual Print Driver to be able to convert KLK documents to XOD, would it be sufficient for me to have a program on my machine that is associated with KLK and has the ability to print? I guess I’m not sure what was meant above when you said “associate the file extension with the print utility”.

Thanks!

  1. assume a vanilla installation of Windows Server 2008 R2 … what formats would be supported without the need for any additional software? Is there a full list I can find somewhere?

For a list of built-in conversions, see: http://www.pdftron.com/pdfnet/addons.html#Convert

Also, since DOCX can be viewed in WordPad on a vanilla installation of Windows Server 2008 R2, you would also be able to convert DOCX via the virtual print driver. That said, this may not be the recommended approach (using MS Office would also support DOC and other Office files and would be more reliable). We are also planning to offer direct Office conversion, but at this point we can’t provide more info on timelines.

  2. Assume I have some made up file format with the extension .KLK. If I wanted the Virtual Print Driver to be able to convert KLK documents to XOD, would it be sufficient for me to have a program on my machine that is associated with KLK and had the ability to print?

Yes, this would be sufficient. When you request a conversion, PDFNet checks which program is associated with the KLK extension (i.e. the default program that opens the file) and instructs that program to silently print the file (via the ‘print’ verb). All of this is just the standard way Windows deals with file extensions and default apps. The KLK viewer/printer app would then ‘print’ the file via the PDFNet Virtual Printer Driver and you would end up with PDF/XPS or XOD.
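To check whether a given application really supports silent printing through its file association, you can invoke the same Windows ‘print’ verb yourself. Below is a minimal sketch using Python's os.startfile, which is available on Windows only. It illustrates the general Windows mechanism and is not how PDFNet is implemented internally; request_silent_print and its launcher parameter are hypothetical names, with launcher added only so the function can be exercised without a printer.

```python
import os


def request_silent_print(path, launcher=None):
    """Ask Windows to run the 'print' verb of the app associated with path.

    launcher is a hypothetical hook for testing; by default the function
    uses os.startfile, which exists only on Windows.
    """
    if launcher is None:
        launcher = getattr(os, "startfile", None)
        if launcher is None:
            raise OSError("the 'print' shell verb is only available on Windows")
    full_path = os.path.abspath(path)
    # Equivalent to right-clicking the file in Explorer and choosing Print.
    launcher(full_path, "print")
    return full_path
```

If request_silent_print("sample.klk") opens a dialog or fails, the associated application probably cannot be driven silently, and a different viewer or converter would be needed for that extension.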

Thanks again for the replies! I just have one more question. Let’s assume for a moment that I have a Word document that I convert to XOD for display purposes. I then allow my user to redact the Word document by drawing a bunch of black rectangles over certain parts of it, and then I store the resulting annotation data. From what I can tell, I have the ability to create a PDF of that Word document using PDFNet that includes those redactions. My question is whether those redactions are a “layer” on top of the PDF - or if they are an actual part of the PDF so that there is no way to see the content underneath.

‘pdftron.PDF.Redactor’ removes (burns) parts of the PDF content. For example, parts of images under the redacted area will be cleared, and vector graphics and text may be removed or edited to fit the redaction region. So the redaction process is very different from adding content on top of a PDF page (e.g. stamping). You could use PDFNet for stamping as well (pdftron.PDF.Stamper); however, it is not secure because one could always remove the top layers to reveal the content underneath.
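The difference can be illustrated with a toy page model. This is not PDFNet code and does not reflect how Redactor or Stamper are implemented; all names below are made up for the example. In particular, this model drops any item that touches the region, whereas a real redactor may trim content to fit it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x1: float
    y1: float
    x2: float
    y2: float

    def intersects(self, other):
        """True when the two rectangles overlap with non-zero area."""
        return (self.x1 < other.x2 and other.x1 < self.x2
                and self.y1 < other.y2 and other.y1 < self.y2)


@dataclass(frozen=True)
class Item:
    kind: str  # "text" or "cover"
    box: Rect
    content: str = ""


def stamp(page, region):
    """Overlay a black box; the original items stay underneath."""
    return page + [Item("cover", region)]


def redact(page, region):
    """Remove every item touching the region, then draw the box."""
    kept = [item for item in page if not item.box.intersects(region)]
    return kept + [Item("cover", region)]


def strip_covers(page):
    """Simulate someone deleting the top-layer rectangles."""
    return [item for item in page if item.kind != "cover"]


def visible_text(page):
    """Return the text content still present in the page model."""
    return [item.content for item in page if item.kind == "text"]
```

With a page holding a secret line and a public line, stripping the covers from the stamped page recovers both lines, while the redacted page yields only the public one.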