Please give a brief summary of your issue:
Replace whole paragraph while maintaining text style of words
Please describe your issue and provide steps to reproduce it:
I need to replace certain words in a PDF from a Node.js backend. I initially tried using the addString method of ContentReplacer to replace them directly. However, when the new word is longer than the word it replaces, the line can extend beyond the page layout.
To solve this, I extracted the whole paragraph containing the relevant word, created the new text, and replaced the whole paragraph with the addText function. This works well, except that the style of the special words (bold, italic) is lost. So I kept a record of each word's index and, after the paragraph replacement, went back, put a white rectangle on top of the replaced word, and placed a word with the new style on top of it.
The problem is that my backend communicates with a frontend client that can replace content multiple times. So, when I extract the text again, duplicate words show up for the special words. I build the paragraphs by looking at the difference in height between lines and grouping nearby lines into the same paragraph (using the paragraph id gave me duplicate paragraphs with ____, ____ values).
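For reference, the line grouping described above can be sketched roughly as follows. This is a minimal illustration, not the actual project code: the ExtractedLine shape, the groupIntoParagraphs name, and the gapFactor threshold are hypothetical, and the coordinates are assumed to come from the extracted line bounding boxes, with y increasing upward as in PDF user space.

```typescript
// Hypothetical shape for one extracted line; this is not a PDFNet type.
interface ExtractedLine {
  text: string;
  top: number;    // y of the line's top edge (y grows upward)
  bottom: number; // y of the line's bottom edge
}

// Groups lines (already sorted top to bottom) into paragraphs. A new paragraph
// starts when the vertical gap to the previous line exceeds gapFactor times
// the previous line's height. gapFactor is an assumed tuning value.
function groupIntoParagraphs(
  lines: ExtractedLine[],
  gapFactor = 0.8
): ExtractedLine[][] {
  const paragraphs: ExtractedLine[][] = [];
  let prev: ExtractedLine | undefined;

  for (const line of lines) {
    const current = paragraphs[paragraphs.length - 1];
    if (prev === undefined || current === undefined) {
      // First line always opens the first paragraph.
      paragraphs.push([line]);
    } else {
      const gap = prev.bottom - line.top;    // space between the two lines
      const height = prev.top - prev.bottom; // height of the previous line
      if (gap > gapFactor * height) {
        paragraphs.push([line]);
      } else {
        current.push(line);
      }
    }
    prev = line;
  }
  return paragraphs;
}
```

The threshold is relative to the line height rather than a fixed number of points, so the same value should behave similarly across font sizes; it would still need tuning per document.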
I tried using the e_remove_hidden_text flag on the text extractor to remove the duplicate text, with no success.
How can I replace words (or paras) so that the replaced content is contained in the layout, style of words is preserved, and duplicate text is removed? Either a solution to my problem or a different approach will be appreciated. Thanks!
Thank you for contacting us about this. You mentioned the following:
Just to clarify, what sort of rectangle did you place here? If you could provide us with an example document along with the steps you take in code (perhaps a working sample project) that would help us understand this issue a bit better.
Hi, apologies for the delayed response. I am simply drawing a white background rectangle on top of the special word, then writing a new word with the original style, and finally replacing the word again (not doing so seems to give me unreliable results). Here is the relevant part of my code (TypeScript). Specialword is the corresponding word I recorded earlier, along with its style and font.