Q: We need to be able to blank out all the text that follows a
specified string on the same line in a PDF.
A: You can break the PDF redaction task into three parts:
1) Search PDF for a given text string.
2) Identify rectangular regions to blank out.
3) Hide or delete text content from given regions.
For PDF text search you could use the TextExtractor class (e.g. as shown
in the TextExtract sample project - http://www.pdftron.com/net/samplecode.html#TextExtract).
Given a bounding box for the selected text (e.g. word.GetBBox()), you
can determine the rectangle from which to clear the text. For example,
Rect(word_bbox.x2, word_bbox.y1, page.GetMediaBox().Width(),
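The rectangle computation can be sketched without any PDF library. The following is a minimal illustration in plain Python, not PDFTron code: Box, Word, find_word and rest_of_line_rect are hypothetical names, and the word list stands in for what TextExtractor would return. It assumes the region should run from the right edge of the matched word to the right edge of the media box, and that its vertical extent is that of the word itself.

```python
from typing import List, NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned rectangle in PDF user space (origin at bottom-left)."""
    x1: float
    y1: float
    x2: float
    y2: float


class Word(NamedTuple):
    """Hypothetical stand-in for an extracted word and its bounding box."""
    text: str
    bbox: Box


def find_word(words: List[Word], target: str) -> Optional[Word]:
    """Return the first word whose text equals target, or None."""
    for word in words:
        if word.text == target:
            return word
    return None


def rest_of_line_rect(word_bbox: Box, media_box: Box) -> Box:
    """Region from the right edge of the word to the right edge of the page.

    Assumption: the region is exactly as tall as the matched word.
    """
    return Box(word_bbox.x2, word_bbox.y1, media_box.x2, word_bbox.y2)


# Made-up sample data: two words on one line of a US Letter page.
words = [
    Word("Name:", Box(72, 700, 110, 712)),
    Word("Alice", Box(115, 700, 150, 712)),
]
media_box = Box(0, 0, 612, 792)

hit = find_word(words, "Name:")
region = rest_of_line_rect(hit.bbox, media_box) if hit else None
print(region)
```

Using the media box's right edge rather than its width keeps the region correct when the media box does not start at x = 0. In practice you may also want to pad the region vertically so that taller glyphs on the same line are covered.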
The simplest way to hide text would be to draw a white rectangle on
top of the given region using ElementBuilder & ElementWriter
(http://www.pdftron.com/net/faq.html#how_watermark). The problem with this
approach is that the rectangle could easily be removed to uncover the text.
A better approach would be to edit the existing page to remove text
runs that intersect the given rectangle. This can be implemented as
shown in the ElementEdit sample project
(http://www.pdftron.com/net/samplecode.html#ElementEdit). The sample program strips all images
from a given PDF page; however, the code could be adapted for other
tasks. For example, the following article describes how to modify text
color underneath a given rectangle:
or search for "How can I modify text color under a given rectangle"
Removing the selected text should be as simple as commenting out
specific lines (i.e. writer.WriteElement(element)) in the referenced
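The filtering idea can be sketched in plain Python as follows. This is an illustration only, not the ElementEdit sample: page elements are modelled as hypothetical (kind, bbox, payload) tuples, bounding boxes are (x1, y1, x2, y2) tuples, and appending to a list stands in for the writer.WriteElement(element) call. Text elements whose bounding box overlaps the region are skipped, and everything else is passed through unchanged.

```python
from typing import List, Tuple

BBox = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
PageElement = Tuple[str, BBox, str]        # (kind, bbox, payload), hypothetical


def intersects(a: BBox, b: BBox) -> bool:
    """True if the two rectangles overlap with non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def filter_elements(elements: List[PageElement], region: BBox) -> List[PageElement]:
    """Return the elements that would be written back to the page."""
    kept = []
    for element in elements:
        kind, bbox, _payload = element
        if kind == "text" and intersects(bbox, region):
            # Skipping the write is what removes the text run.
            continue
        kept.append(element)  # stands in for writer.WriteElement(element)
    return kept


# Made-up sample data: the region covers the rest of the line after "Name:".
region = (110, 700, 612, 712)
elements = [
    ("text", (72, 700, 110, 712), "Name:"),
    ("text", (115, 700, 150, 712), "Alice"),
    ("image", (300, 690, 400, 720), "logo"),
    ("text", (72, 680, 200, 692), "Next line"),
]

kept = filter_elements(elements, region)
print([payload for _kind, _bbox, payload in kept])
```

Note that a single text run may extend beyond the region; dropping the whole run removes that extra text as well, so the search string and the text to be blanked out ideally sit in separate runs.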