Will PDFNet SDK with Redactor (https://www.pdftron.com/pdfnet/docs/PDFNetC/d1/d02/classpdftron_1_1_p_d_f_1_1_redactor.html) allow users to do a batch redaction? For example, if someone wanted to redact a certain area on every page (of, let's say, a 100-page PDF file), can this be done without redacting every page one by one?
If every page has the same dimensions, then redacting the same area on each page is trivial.
Here, for example, is our sample code for performing redactions on various pages of a PDF document:
To perform the same redaction on each page would be as simple as:
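PDFDoc doc((input_path + "newsletter.pdf").c_str());
int page_num = doc.GetPageCount();
// Queue the same region on every page (page numbers are 1-based).
vector<Redactor::Redaction> vec;
for (int i=1; i<=page_num; ++i)
    vec.push_back(Redactor::Redaction(i, Rect(100, 100, 550, 600), false, "Top Secret"));
Redactor::Appearance app;
app.RedactionOverlay = true;
app.Border = false;
app.ShowRedactedContentRegions = true;
// Redact() is the helper function defined in the PDFRedact sample.
Redact(input_path + "newsletter.pdf",
       output_path + "redacted.pdf", vec, app);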
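When page sizes differ, a fixed rectangle no longer covers the same relative area on every page. One possible approach, sketched here under assumptions, is to express the region as fractions of each page box and resolve it per page. PageBox, FractionalRegion, ResolveRegion and ResolveForAllPages are hypothetical stand-ins for illustration, not PDFNet types; in a real program the page boxes would come from the SDK and each resolved rectangle would be turned into a Rect for a Redactor::Redaction.

```cpp
#include <vector>

// Hypothetical stand-in for a page box or redaction rectangle in PDF user
// space (origin at the lower left). This is NOT the PDFNet Rect class.
struct PageBox {
    double x1, y1, x2, y2;
};

// Region expressed as fractions (0..1) of the page width and height.
struct FractionalRegion {
    double left, bottom, right, top;
};

// Map the fractional region onto one concrete page box.
PageBox ResolveRegion(const PageBox& page, const FractionalRegion& r) {
    const double w = page.x2 - page.x1;
    const double h = page.y2 - page.y1;
    return PageBox{page.x1 + r.left * w, page.y1 + r.bottom * h,
                   page.x1 + r.right * w, page.y1 + r.top * h};
}

// Produce one absolute rectangle per page, in page order.
std::vector<PageBox> ResolveForAllPages(const std::vector<PageBox>& pages,
                                        const FractionalRegion& r) {
    std::vector<PageBox> out;
    out.reserve(pages.size());
    for (const PageBox& page : pages) {
        out.push_back(ResolveRegion(page, r));
    }
    return out;
}
```

Entry i of the returned vector corresponds to page i + 1, since PDF page numbers start at 1.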
The Redactor also supports the concept of 'negative' redactions, which may be relevant to your batch processing. A document-based negative redaction expands beyond a single page to automatically remove content from other pages in the document.
In the sample https://www.pdftron.com/pdfnet/samplecode.html#PDFRedact, it looks like we are redacting based on a region. Is there a way to redact based on text found in the PDF?
You can perform a text search to find the bounding boxes of text matching a regular expression, as shown in the TextSearch sample code:
Once you have the bounding boxes, you can pass them to Redactor as the coordinates of the regions to redact.
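Search highlights are often reported as quads (four corner points) rather than axis-aligned rectangles, so a small conversion step may be needed before building a redaction region. The following is a minimal sketch, assuming each quad arrives as eight doubles in x,y order. RedactBox and QuadToBox are hypothetical names, not part of PDFNet, and the optional padding is an illustrative choice to make sure glyph edges are covered.

```cpp
#include <algorithm>
#include <array>

// Hypothetical axis-aligned rectangle; NOT the PDFNet Rect class.
struct RedactBox {
    double x1, y1, x2, y2;
};

// Convert one quad (x0,y0, x1,y1, x2,y2, x3,y3) into the smallest
// axis-aligned box that contains all four corners, grown by `pad`
// units on every side.
RedactBox QuadToBox(const std::array<double, 8>& q, double pad = 0.0) {
    const auto xs = std::minmax({q[0], q[2], q[4], q[6]});
    const auto ys = std::minmax({q[1], q[3], q[5], q[7]});
    return RedactBox{xs.first - pad, ys.first - pad,
                     xs.second + pad, ys.second + pad};
}
```

Because the result is the enclosing box of all four corners, the corner order within the quad does not matter, and rotated text yields a box that fully covers it.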
Alternatively, you could use TextExtractor and do your own search/text processing.
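As a rough illustration of doing your own text processing, the sketch below filters already-extracted words with a regular expression and collects their boxes as redaction candidates. It assumes the words and their bounding boxes have been pulled out of the document beforehand. Word, Hit and FindMatches are hypothetical names for this example only, not PDFNet types.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical record for one extracted word and its bounding box.
struct Word {
    int page;            // 1-based page number
    std::string text;
    double x1, y1, x2, y2;
};

// Hypothetical redaction candidate: a page number plus a rectangle.
struct Hit {
    int page;
    double x1, y1, x2, y2;
};

// Return the box of every word that contains a match for `pattern`.
// The whole word box is reported even if only part of the word matches.
std::vector<Hit> FindMatches(const std::vector<Word>& words,
                             const std::regex& pattern) {
    std::vector<Hit> hits;
    for (const Word& w : words) {
        if (std::regex_search(w.text, pattern)) {
            hits.push_back(Hit{w.page, w.x1, w.y1, w.x2, w.y2});
        }
    }
    return hits;
}
```

Matching word by word will miss phrases that span several words; handling those would require joining adjacent words and merging their boxes, which this sketch leaves out.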
Does Redactor just place a box over the text OR does it permanently replace the text with the
It is important that the user cannot use any sort of PDF annotation program to remove the redacted box and see the text EVER after the redaction is complete.
The Redactor completely removes the redacted region from the PDF content, including text, vector content, images, and annotations. It does NOT simply add an annotation or image mask over the content. Once PDFNet redacts the content, the content is erased from the document.
Which sample code would allow the user to highlight a bounding box on a PDF page so that I can pass it to the Redactor?
For an interactive sample that uses PDFViewCtrl, see: