I am using PDFNet to extract image data from a PDF file. Right now I can only get the position and size of each image. How can I get the alt text of the images? Thank you.
Code:
const doc = await PDFNet.PDFDoc.createFromFilePath(pdfPath);
await doc.initSecurityHandler();

// Extract form fields and annotations into an FDF document.
const doc_fields = await doc.fdfExtract(PDFNet.PDFDoc.ExtractFlag.e_both);
// Export annotations from FDF to XFDF.
const xfdf_data = await doc_fields.saveAsXFDFAsString();
console.log("xfdf_data", xfdf_data);

// await doc.lock();
const page = await doc.getPage(1);
if (page.id === "0") {
  console.log("Page not found.");
  return 1;
}

// Walk every element on the page and log the images.
const reader = await PDFNet.ElementReader.create();
await reader.beginOnPage(page);
for (
  let element = await reader.next();
  element !== null;
  element = await reader.next()
) {
  const type = await element.getType();
  if (
    [
      PDFNet.Element.Type.e_image,
      PDFNet.Element.Type.e_inline_image,
    ].includes(type)
  ) {
    const bbox = await element.getBBox();
    const imageData = await element.getImageData();
    console.log("image", bbox, "imageData:", imageData);
    continue;
  }
}
Current output from the code above:
image {
  name: 'Rect',
  y2: 690.72,
  x1: 72,
  mp_rect: '0',
  x2: 334.97999999999996,
  y1: 427.68
} imageData: { name: 'Filter', id: '1883eaadff0' }
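For reference, this is roughly how I turn that Rect into a position and size. It is only a small sketch: `rectToBox` is my own helper name and not part of PDFNet, and I am assuming the Rect fields are plain numbers in the page's coordinate units.

```javascript
// Hypothetical helper (not a PDFNet API): convert a Rect-like object
// with x1/y1/x2/y2 corners into a position plus width and height.
function rectToBox(rect) {
  return {
    // Use the smaller coordinate as the origin in case the corners are swapped.
    x: Math.min(rect.x1, rect.x2),
    y: Math.min(rect.y1, rect.y2),
    width: Math.abs(rect.x2 - rect.x1),
    height: Math.abs(rect.y2 - rect.y1),
  };
}

// Corner values copied from the output above.
const box = rectToBox({
  x1: 72,
  y1: 427.68,
  x2: 334.97999999999996,
  y2: 690.72,
});
console.log(box);
```

This gives me the geometry of the image, but nothing about its alt text.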