Product Version: 9.2.0-1
Please give a brief summary of your issue:
Replace whole paragraph while maintaining text style of words
Please describe your issue and provide steps to reproduce it:
I need to replace certain words in a PDF from a Node.js backend. I first tried the addString method of the content replacer to replace them directly. However, when the new word is longer than the word it replaces, the line can extend beyond the page layout.
To solve this, I extracted the whole paragraph containing the relevant word, built the new text, and replaced the whole paragraph with the addText function. This works well, except that the styling of the special words (bold, italic) is lost. So I kept a record of each special word's index and, after the paragraph replacement, went back, drew a white rectangle over the replaced word, and placed the word with the correct style on top of it.
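For illustration, the index bookkeeping described above might look roughly like the following. This is a minimal sketch in plain JavaScript, not SDK code: the word shape `{ text, bold, italic }` and the helper names `replaceWordsKeepingStyle` and `styledWordIndexes` are hypothetical, and it assumes the style of each word has already been read from the PDF.

```javascript
// Hypothetical word shape: { text: string, bold: boolean, italic: boolean }.
// `replacements` is a Map from an original word index to its new text,
// which may contain several words.
function replaceWordsKeepingStyle(words, replacements) {
  const result = [];
  words.forEach((word, index) => {
    const newText = replacements.has(index) ? replacements.get(index) : word.text;
    // Every word produced by a replacement inherits the style of the word it replaces.
    for (const part of newText.split(/\s+/).filter(Boolean)) {
      result.push({ text: part, bold: word.bold, italic: word.italic });
    }
  });
  return result;
}

// Indexes (in the new word list) of the words that need restyling
// after the plain-text paragraph has been written.
function styledWordIndexes(words) {
  return words.flatMap((w, i) => (w.bold || w.italic ? [i] : []));
}
```

Recomputing the indexes from the new word list matters because a replacement that adds or removes words shifts the position of every styled word after it.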
The problem is that my backend serves a frontend client that can replace content multiple times. So when I extract the text again, the special words show up twice. I build the paragraphs by looking at the difference in height between lines and putting nearby lines in the same paragraph (using the paragraph id gave me duplicate paragraphs with ____, ____ values).
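The paragraph grouping described above can be sketched as follows. This is only an approximation of the approach under stated assumptions: the line shape `{ text, y, height }`, the function name `groupLinesIntoParagraphs`, and the `gapFactor` threshold are all hypothetical, and `y` is assumed to be the baseline in PDF user space (origin at the bottom left, so a larger `y` is higher on the page).

```javascript
// Hypothetical line shape: { text: string, y: number, height: number }.
// Lines whose baseline sits more than gapFactor * height below the
// previous line start a new paragraph.
function groupLinesIntoParagraphs(lines, gapFactor = 1.5) {
  // Sort top to bottom: in PDF user space a larger y is higher on the page.
  const sorted = [...lines].sort((a, b) => b.y - a.y);
  const paragraphs = [];
  let current = [];
  for (const line of sorted) {
    const prev = current[current.length - 1];
    if (prev && prev.y - line.y > gapFactor * prev.height) {
      paragraphs.push(current);
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) paragraphs.push(current);
  // Join the lines of each paragraph into a single string.
  return paragraphs.map(p => p.map(l => l.text).join(' '));
}
```

With 12-unit lines spaced 14 units apart, for example, a 32-unit jump between baselines exceeds the default threshold of 18 units and starts a new paragraph.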
I tried using the e_remove_hidden_text flag on the text extractor to remove the duplicate text, with no success.
How can I replace words (or paragraphs) so that the replaced content stays within the layout, the style of the words is preserved, and no duplicate text is extracted? Either a solution to my problem or a different approach would be appreciated. Thanks!