Finding text adjacent (left or right) to the selected text

Product: @pdftron/webviewer, used with Angular

Product Version: "@pdftron/webviewer": "^8.3.0"

Please give a brief summary of your issue:
Consider a structure in the PDF as shown below:

When I select text, the number “50” in this case, I need to pick up the value it is associated with, which is “Common Stock”. The result should give me “Common Stock” along with the selected number 50.

Please describe your issue and provide steps to reproduce it:
I tried the options below in the textSelected event handler. I am able to get 50 but not “Common Stock”.

Please find my comments in the code below. I tried:

    const doc = await documentViewer.getDocument().getPDFDoc();
    const page = await doc.getPage(pageNum);
    const txt = await PDFNet.TextExtractor.create();
    await txt.begin(page);

    let text_T: string;

    // Attempt 1: plain text. This returns the page text, but 50 does not
    // appear next to "Common Stock"; it ends up on a line below it.
    text_T = await txt.getAsText();
    console.log(text_T);

    // Attempt 2: XML with word elements, bounding boxes and style info.
    // The two required pieces of information land in different paragraphs,
    // with no hint as to what 50 is associated with.
    text_T = await txt.getAsXML(
      PDFNet.TextExtractor.XMLOutputFlags.e_words_as_elements |
      PDFNet.TextExtractor.XMLOutputFlags.e_output_bbox |
      PDFNet.TextExtractor.XMLOutputFlags.e_output_style_info
    );
      

Could you please let me know how we can achieve this?

Hello sameerghorpade,

  1. Can you provide the file here for testing? If you are unable to post it publicly, I will provide my email address below.
  2. In WebViewer, when you select ‘Common stock’ and drag to the other side of the page, does the 50 get selected as well?
  3. We have a couple of text extraction demos, which you can find here:
    a) PDFTron
    b) PDF Text Extractor Tool | Online Demo | PDFTron WebViewer

Best regards,
Tyler Gordon
Web Development Support Engineer
tgordon@pdftron.com

Hello,

Apologies, I cannot provide the PDF. The requirement is this: when you double-click the number 50 with the mouse, the textSelected event fires. All good up to this point. But at the same time, “Common Stock” also needs to be extracted programmatically.

Regards,
Sameer

Hello sameerghorpade,

Unfortunately, it's difficult to test how our OCR works on a document when I can't test it locally. However, we have documentation on text extraction in the guides I provided previously.

Here is another guide on extracting selected text: PDFTron
Selecting text with coordinates: PDFTron and How to programmatically extract text within a given rectangle (x, y coordinates)?
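
As a rough illustration of the coordinate-based idea, here is a minimal sketch that works on plain data only. It assumes you already have a bounding box for every word on the page (for example, parsed from XML output requested with e_output_bbox) and for the selected word. The names WordBox and findRowNeighbors, and all coordinate values, are hypothetical and not part of the WebViewer API. Words count as being on the same row when their boxes overlap vertically by at least half the height of the shorter box.

```typescript
// Hypothetical shape (not a WebViewer type): one word with its bounding box
// in page coordinates, (x1, y1) = lower-left and (x2, y2) = upper-right.
interface WordBox {
  text: string;
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

// Vertical overlap of two boxes as a fraction of the shorter box's height.
function verticalOverlap(a: WordBox, b: WordBox): number {
  const overlap = Math.min(a.y2, b.y2) - Math.max(a.y1, b.y1);
  const minHeight = Math.min(a.y2 - a.y1, b.y2 - b.y1);
  return minHeight > 0 ? Math.max(0, overlap) / minHeight : 0;
}

// Collects the words that share a visual row with `selected`, split into
// those entirely to its left and those entirely to its right, each joined
// in left-to-right order.
function findRowNeighbors(
  selected: WordBox,
  words: WordBox[],
  minOverlap = 0.5
): { left: string; right: string } {
  const sameRow = words.filter(w => verticalOverlap(w, selected) >= minOverlap);
  const byX = (a: WordBox, b: WordBox) => a.x1 - b.x1;
  const left = sameRow.filter(w => w.x2 <= selected.x1).sort(byX);
  const right = sameRow.filter(w => w.x1 >= selected.x2).sort(byX);
  return {
    left: left.map(w => w.text).join(" "),
    right: right.map(w => w.text).join(" "),
  };
}

// Made-up sample data: two table rows, label on the left, number on the right.
const words: WordBox[] = [
  { text: "Common", x1: 72, y1: 700, x2: 120, y2: 712 },
  { text: "Stock", x1: 124, y1: 700, x2: 155, y2: 712 },
  { text: "50", x1: 480, y1: 700, x2: 495, y2: 712 },
  { text: "Preferred", x1: 72, y1: 680, x2: 130, y2: 692 },
  { text: "20", x1: 480, y1: 680, x2: 495, y2: 692 },
];

const neighbors = findRowNeighbors(words[2], words); // selected word: "50"
console.log(neighbors.left); // "Common Stock"
```

The overlap threshold is a judgment call. Real documents with mixed font sizes or slightly misaligned columns may need a different value, or a fallback that picks the nearest row when nothing overlaps.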

Best regards,
Tyler Gordon
Web Development Support Engineer
PDFTron