PDF to Powerpoint Converter

savinskiy.konstantin · December 28, 2021, 1:46pm

Do you have SDK methods for converting PDF files to Powerpoint?
For example, like this one: PDF to Powerpoint Converter: Free Online Conversion | PDF Online
Do you have tools for getting info about text blocks that we can see after converting to Powerpoint? For example, I need to get info about text paragraphs, bullet and numbered lists, tables, etc. without creating a Powerpoint file.
Converting to Powerpoint gives much better results than converting to HTML or using text extraction.

Thanks

Ryan · December 29, 2021, 6:26pm

Do you have SDK methods for converting PDF files to Powerpoint?

Yes, the new API that pdf.online is using will be released in an upcoming PDFTron SDK release early next year. To get notified of the release, you can subscribe here.

Do you have tools for getting info about text blocks that we can see after converting to Powerpoint?
I need to get info about text paragraphs, bullet and numbered lists, tables, etc. without creating a Powerpoint file.

It would be best if you could elaborate on why getting info about text paragraphs/bullets/etc. is important for you.
When you say “info” what do you mean exactly? X/Y position (relative to what)?
What do you do with this information once you have it? How does it help you or your users with this info?

Once I know your overall objective then I can assist you best.

savinskiy.konstantin · December 30, 2021, 7:36am

Thanks for the answers.

Our goal is converting pdf to specific HTML. And we need a tool that can detect text paragraphs, tables, bullets and numbered lists, etc. You have tools for converting PDF to HTML or Text Extraction. But the result isn’t the same as what I can see in Powerpoint.

When I say “info”, I mean the ability to get, for example, JSON with all the information about the shapes from Powerpoint. We don’t need a ppt file, but we need the info that the ppt contains. For example, absolute positions of shapes, width and height, paragraphs info (margins, line height, list options, etc), text runs content and font properties (size, weight, font family, color and other).
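To make the request concrete, here is a minimal sketch of the kind of JSON meant above. It is only an illustration of the desired shape of the data: every class and field name (`Shape`, `Paragraph`, `TextRun`, `x_pt`, `list_type`, and so on) is a hypothetical assumption, not part of the PDFTron SDK or of any existing output format.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class TextRun:
    # A stretch of text that shares one set of font properties (hypothetical fields).
    text: str
    font_family: str
    font_size_pt: float
    bold: bool = False
    color: str = "#000000"


@dataclass
class Paragraph:
    # Paragraph-level layout info; list_type is None, "bullet", or "numbered".
    runs: List[TextRun]
    line_height: float = 1.2
    margin_top_pt: float = 0.0
    margin_bottom_pt: float = 0.0
    list_type: Optional[str] = None


@dataclass
class Shape:
    # Absolute position and size of a text block on the page, in points.
    x_pt: float
    y_pt: float
    width_pt: float
    height_pt: float
    paragraphs: List[Paragraph] = field(default_factory=list)


def shapes_to_json(shapes: List[Shape]) -> str:
    # asdict() recurses into nested dataclasses and lists of dataclasses.
    return json.dumps([asdict(s) for s in shapes], indent=2)


example = [
    Shape(
        x_pt=72.0, y_pt=90.0, width_pt=400.0, height_pt=40.0,
        paragraphs=[
            Paragraph(
                runs=[TextRun("First item", "Helvetica", 12.0)],
                list_type="bullet",
            )
        ],
    )
]
print(shapes_to_json(example))
```

Each shape carries its own geometry, and each text run keeps the original font family, so a consumer could map runs to fonts without opening a ppt file.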

An additional issue is getting the original fonts or font families. For example, the conversion to HTML uses the right fonts. I know that these fonts don’t contain all glyphs. Can we have the same fonts in Powerpoint? Or, if we are talking about getting info, could the original font families be added to the text runs?

Thank you.

Ryan · January 20, 2022, 4:32pm

Our goal is converting pdf to specific HTML.
Yes, but why do you want to do this?
Why not operate on the PDF itself?
How does having Powerpoint and/or HTML help you exactly?
What do you do with the HTML/Powerpoint output?
If you prefer the Powerpoint output over the HTML output, then why not use that?

absolute positions of shapes, width and height, paragraphs info (margins, line height, list options, etc), text runs content and font properties (size, weight, font family, color and other).
Why is this info important for you?
What do you do with this info?

The better I understand your overall objective/requirements, the better I can assist you.

savinskiy.konstantin · January 24, 2022, 7:44am

We are building a tool that lets users convert a PDF to HTML, edit the converted HTML, and then publish it as a website. To make editing convenient, we want to provide HTML that contains correct text paragraphs, lists, tables, etc. So we need all the info about text positions, font properties, etc. Even if we can’t embed a font in a page automatically, we want to specify the correct font family so that users can upload the needed fonts.
And when I talk about Powerpoint, I mean that your conversion to PowerPoint is much better than your conversion to HTML at grouping unconnected text fragments from the PDF into text blocks (paragraphs, tables, lists).
So I’m interested in the ability to get info about these text blocks without creating a ppt file.
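As a rough illustration of why grouped text blocks matter for us, the sketch below turns a list of already-grouped paragraphs into editable HTML. The input format (plain dicts with `runs`, `list_type`, `font_family`, `font_size_pt`) is a hypothetical assumption for this example and is not something the SDK is known to produce.

```python
from html import escape
from itertools import groupby


def render_run(run):
    # Keep the original font family so users can upload the matching font later.
    style = f"font-family: {run['font_family']}; font-size: {run['font_size_pt']}pt"
    return f'<span style="{escape(style, quote=True)}">{escape(run["text"])}</span>'


def render_paragraph_body(par):
    return "".join(render_run(r) for r in par["runs"])


def render_blocks(paragraphs):
    # Consecutive paragraphs with the same list_type become one <ul>/<ol>;
    # paragraphs with list_type None become plain <p> elements.
    parts = []
    for list_type, group in groupby(paragraphs, key=lambda p: p.get("list_type")):
        group = list(group)
        if list_type is None:
            parts.extend(f"<p>{render_paragraph_body(p)}</p>" for p in group)
        else:
            tag = "ul" if list_type == "bullet" else "ol"
            items = "".join(f"<li>{render_paragraph_body(p)}</li>" for p in group)
            parts.append(f"<{tag}>{items}</{tag}>")
    return "\n".join(parts)


blocks = [
    {"runs": [{"text": "Intro & more", "font_family": "Arial", "font_size_pt": 12}],
     "list_type": None},
    {"runs": [{"text": "a", "font_family": "Arial", "font_size_pt": 12}],
     "list_type": "bullet"},
    {"runs": [{"text": "b", "font_family": "Arial", "font_size_pt": 12}],
     "list_type": "bullet"},
]
print(render_blocks(blocks))
```

With per-block info like this, lists and paragraphs come out as real `<ul>`, `<ol>`, and `<p>` elements instead of absolutely positioned text fragments, which is what makes the HTML convenient to edit.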

Ryan · February 10, 2022, 7:54pm

Please note that PDFNet 9.2 is out, and our PDF to PowerPoint and our PDF to reflowable HTML are both powered by the same Module (from the Solid Documents team).

Please try out the new 9.2 SDK and the new Structured Output Module.

If you are still looking for assistance, I think the next step would be to schedule a call with one of our solution engineers, which you can do here.