Q: We are using PDFTron PDFNet API to extract images from PDF. When we
extract images using PDFTron, the image resolution is not the same as
when we use Illustrator. Also, transparencies are not fully maintained
during the export. Is there something wrong with our code?
Function ProcessImages(ByVal page As Page, ByRef dPage As dsPage) As String
    Dim reader As ElementReader = New ElementReader()
    reader.Begin(page)
    Dim element As Element = reader.Next()
    Dim xml As String = ""
    While Not IsNothing(element)
        If element.GetType() = element.Type.e_image OrElse _
           element.GetType() = element.Type.e_inline_image Then
            ' Map the unit-square corner (1, 1) into page space.
            Dim ctm As Matrix2D = element.GetCTM()
            Dim x2 As Double = 1
            Dim y2 As Double = 1
            ctm.Mult(x2, y2)
            If element.GetType() = element.Type.e_image Then
                Dim dImage As New dsImage()
                dImage.ImageGUID = Guid.NewGuid()
                Dim fname As String = output_path + _
                    "image_extract1_" + imageCounter.ToString() + "_" + _
                    dImage.ImageGUID.ToString() + ".png"
                Dim image As pdftron.PDF.Image = _
                    New pdftron.PDF.Image(element.GetXObject())
                ' Decode the image at its native pixel size and save it.
                Dim bmp As System.Drawing.Bitmap = image.GetBitmap()
                bmp.MakeTransparent()
                bmp.Save(fname)
            End If
        End If
        element = reader.Next()
    End While
    reader.End()
    Return ""
End Function
----------------------
A: The PDFNet function you used extracts the image without any changes
to its native resolution (so any resolution difference is due to
Illustrator settings).
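
To compare the two numbers, it can help to work out the effective
resolution of the image as it is placed on the page. The following is
only a sketch: it assumes you pass in the pixel size of the extracted
bitmap and the a, b, c, d entries of the matrix returned by
element.GetCTM(), and that the page uses the default 72 units per inch.
The module and function names are made up for illustration and are not
part of PDFNet.

```vb
Imports System

Module ImageResolution
    ' Effective horizontal and vertical DPI of an image drawn with the
    ' matrix [a b c d h v]. A PDF image fills the unit square, so the
    ' lengths of the transformed unit vectors are its placed size in
    ' points (assumption: 72 page units per inch).
    Function EffectiveDpi(ByVal pixelWidth As Integer, _
                          ByVal pixelHeight As Integer, _
                          ByVal a As Double, ByVal b As Double, _
                          ByVal c As Double, ByVal d As Double) As Double()
        Dim widthPts As Double = Math.Sqrt(a * a + b * b)
        Dim heightPts As Double = Math.Sqrt(c * c + d * d)
        Return New Double() {pixelWidth * 72.0 / widthPts, _
                             pixelHeight * 72.0 / heightPts}
    End Function
End Module
```

For example, a 600 x 300 pixel image placed with a = 144 and d = 72
(2 x 1 inches) works out to 300 DPI in both directions, regardless of
the resolution an application chooses when it re-exports the image.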
In terms of transparency, in PDF an image can be associated with
another image (or even vector artwork) as a soft (i.e. alpha)
channel. Calling bmp.MakeTransparent() does not recreate that channel;
it only makes a single color fully transparent.
You can use the API to extract all the information required to recreate
a soft channel for the embedded image; however, it is fairly involved.
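
As a rough illustration of the last step only, here is a sketch of how
decoded color samples and soft-mask samples could be merged into ARGB
pixels. It assumes both have already been decoded to the same pixel
dimensions, with colors packed as &HRRGGBB and the mask as one byte per
pixel. In real files the mask (the image's /SMask entry) often has a
different size and must be resampled first, which is part of what makes
the full job involved. The names below are hypothetical.

```vb
Imports System

Module SoftMask
    ' Combine packed RGB pixels (&HRRGGBB) with 8-bit soft-mask samples
    ' into 32-bit ARGB values. Both arrays hold one entry per pixel.
    Function ApplyAlpha(ByVal rgb As Integer(), _
                        ByVal alpha As Byte()) As Integer()
        If rgb.Length <> alpha.Length Then
            Throw New ArgumentException( _
                "image and mask must have the same pixel count")
        End If
        Dim argb(rgb.Length - 1) As Integer
        For i As Integer = 0 To rgb.Length - 1
            ' Alpha goes in the top byte; the color keeps the low 24 bits.
            argb(i) = (CInt(alpha(i)) << 24) Or (rgb(i) And &HFFFFFF)
        Next
        Return argb
    End Function
End Module
```

Each resulting value could then be written to a 32-bit ARGB bitmap
before saving as PNG.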
Do you need to extract the embedded images exactly as they are stored
in the PDF, or would a rendered PDF page be sufficient? In the latter
case you could use ‘pdftron.PDF.PDFDraw’ (as shown in PDFDraw sample
project - http://www.pdftron.com/pdfnet/samplecode.html#PDFDraw) to
generate an image from a PDF page. If you need to render only the
area covered by an image, you could crop the page using the image's
bounding box before using PDFDraw.
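
One way to obtain that bounding box is from the CTM your code already
reads. The sketch below assumes the six matrix entries are passed in as
plain numbers and that the matrix maps straight into page coordinates
(page rotation is ignored). The module and function names are
hypothetical and not part of PDFNet.

```vb
Imports System

Module ImageBounds
    ' Axis-aligned bounding box, in page units, of an image drawn with
    ' the matrix [a b c d h v]. The image fills the unit square, so its
    ' corners map to (h, v), (a+h, b+v), (c+h, d+v), (a+c+h, b+d+v).
    ' Returns {x1, y1, x2, y2} with x1 <= x2 and y1 <= y2.
    Function ImageBBox(ByVal a As Double, ByVal b As Double, _
                       ByVal c As Double, ByVal d As Double, _
                       ByVal h As Double, ByVal v As Double) As Double()
        Dim xs As Double() = New Double() {h, a + h, c + h, a + c + h}
        Dim ys As Double() = New Double() {v, b + v, d + v, b + d + v}
        Dim box As Double() = New Double() {xs(0), ys(0), xs(0), ys(0)}
        For i As Integer = 1 To 3
            box(0) = Math.Min(box(0), xs(i))
            box(1) = Math.Min(box(1), ys(i))
            box(2) = Math.Max(box(2), xs(i))
            box(3) = Math.Max(box(3), ys(i))
        Next
        Return box
    End Function
End Module
```

For a matrix of [144 0 0 72 36 500] this gives the rectangle
(36, 500) to (180, 572), which could then serve as the crop region for
the page before it is rendered.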