Using PDFRasterizer compared to PDFDraw.Export on the same PDF page.

Q: I am trying to rasterize a PDF page to a buffer which I then read
into my application. I am using the following C++ code:

   float drawing_scale = 1;
   Common::Matrix2D mtx(drawing_scale, 0, 0, drawing_scale, 0, 0);
   PDF::Rect bbox(page.GetMediaBox());
   bbox.Normalize();
   int width = int(bbox.Width() * drawing_scale);
   int height = int(bbox.Height() * drawing_scale);

   // Stride is represented in bytes and is aligned on 4 byte
   // boundary so that you can render directly to GDI bitmap.
   // A negative value for stride can be used to flip the image
   // upside down.
   int comps = 4; // for BGRA
   int stride = ((width * comps + 3) / 4) * 4;

   // buf is a memory buffer containing at least (stride*height) bytes.
   // Clear the background to opaque white paper color.
   memset(buf, 0xFF, height*stride);

   PDFRasterizer rast;
   rast.SetRasterizerType(PDF::PDFRasterizer::e_GDIPlus);
   rast.Rasterize(page, buf, width, height, stride, mtx);

Unfortunately when I read buf into my application it looks poor.

For debugging I did the following:

    PDF::PDFDraw draw2;
    draw2.SetRasterizerType(PDF::PDFRasterizer::e_GDIPlus);
    draw2.Export(page, "u:\\test.png");

When I read test.png into my app it looks much better. Certainly the
algorithms to read the two into my application are very different.

The dimensions of the PNG in test.png are 2024 x 1564, however
page.GetMediaBox() reports its dimensions as 1584 x 1224 (so I assume
that buf contains a 1584 x 1224 image).

I am confused why the two dimensions are so different?
--------
A: The method 'page.GetMediaBox()' (and other Page methods) returns
dimensions in device-independent units called points (1 pt = 1/72
inch).

To compute the pixel dimensions in 'device space' you are using the
'mtx' transform (which in your case is the identity matrix, so one
point maps to one pixel). To produce a larger bitmap, increase the
'drawing_scale' factor (e.g. to 1.2); to produce a smaller bitmap,
decrease it (e.g. drawing_scale = 0.5;).
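
As a rough illustration of how 'drawing_scale' affects the buffer you
have to allocate, the sizing arithmetic from the question can be
isolated as below. This is only a sketch: RasterLayout, ComputeLayout
and MakeWhiteBuffer are hypothetical helper names, not part of PDFNet,
and the page size is passed in as plain numbers (in points) rather than
read from a Page object.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type (not a PDFNet class): pixel dimensions and
// row stride of a raster buffer.
struct RasterLayout {
    int width;   // pixels
    int height;  // pixels
    int stride;  // bytes per row, aligned on a 4-byte boundary
};

// Same arithmetic as in the question: scale the page size (in points)
// by drawing_scale, truncate to whole pixels, then align the stride.
RasterLayout ComputeLayout(double width_pts, double height_pts,
                           double drawing_scale, int comps = 4) {
    RasterLayout r;
    r.width  = static_cast<int>(width_pts * drawing_scale);
    r.height = static_cast<int>(height_pts * drawing_scale);
    r.stride = ((r.width * comps + 3) / 4) * 4;
    return r;
}

// Allocate stride*height bytes, pre-filled with 0xFF (opaque white).
std::vector<unsigned char> MakeWhiteBuffer(const RasterLayout& r) {
    return std::vector<unsigned char>(
        static_cast<std::size_t>(r.stride) * r.height, 0xFF);
}
```

For the 1584 x 1224 pt page in the question, drawing_scale = 1 gives a
1584 x 1224 pixel buffer with a stride of 6336 bytes, and
drawing_scale = 0.5 gives 792 x 612 pixels. The pointer to pass to
Rasterize() would then be the vector's data().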

The process used to rasterize a PDF page is the same in both PDFDraw
and PDFRasterizer, so any differences in output are due to differences
in parameters. Actually PDFDraw is a simple wrapper around
PDFRasterizer that makes it simpler and more intuitive to render the
pages.

For example,

PDF::PDFDraw pdfdraw;
pdfdraw.Export(page, "test.png");

would render the given page at 92 DPI (which is the default value). At
92 DPI a 1584 x 1224 pt page comes out as 2024 x 1564 pixels, which is
the size of your PNG. To produce the same output as your
PDFRasterizer-based code, call pdfdraw.SetDPI(72) before
pdfdraw.Export(...).
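
The relation between points, DPI and pixels can be sketched as follows.
PointsToPixels and DpiToScale are hypothetical helper names (not PDFNet
functions), and rounding to the nearest pixel is an assumption made for
this sketch; the library's own rounding may differ.

```cpp
#include <cmath>

// Hypothetical helper: convert a length in points (1 pt = 1/72 inch)
// to pixels at the given resolution. Rounding to nearest is an
// assumption of this sketch.
int PointsToPixels(double points, double dpi) {
    return static_cast<int>(std::lround(points * dpi / 72.0));
}

// Hypothetical helper: the scale factor that corresponds to a given
// DPI, since a scale of 1 maps one point to one pixel (72 DPI).
double DpiToScale(double dpi) {
    return dpi / 72.0;
}
```

With these helpers, PointsToPixels(1584, 92) is 2024 and
PointsToPixels(1224, 92) is 1564, while 72 DPI leaves both values
unchanged. Going the other way, passing DpiToScale(92) as
'drawing_scale' in the PDFRasterizer code should give a bitmap of about
the same size as the default PDFDraw output.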

If you prefer to work in pixel units, you could use
pdfdraw.SetImageSize(width, height) instead of pdfdraw.SetDPI(). For
example:

double pg_ratio = page.GetPageWidth() / page.GetPageHeight();
// Keep the page aspect ratio: pg_ratio is width/height, so
// height = width / pg_ratio.
int width = 1000, height = (int)(width / pg_ratio);
pdfdraw.SetImageSize(width, height);

Also, we typically recommend using the 'built-in' rasterizer (the
default) instead of GDI+, because the Windows GDI graphics model is not
fully compatible with PDF.