How do I embed and extract TIFF in PDF without losing any information?

Q:

I need to embed TIFF files in a PDF file and later extract these
images from the PDF file.

This works well, but the resolution of the extracted TIFF image has
changed (the DPI attribute is not set properly).

I need to extract TIFF images that are exactly the same as the TIFF
images I embedded (using the ExportAsTiff(string) method?).

The following is the C# code I use to extract the images from the PDF:

PDFNet.Initialize();
PDFDoc one_pdf = new PDFDoc(tb_pdf.Text);
PageIterator page_end = one_pdf.PageEnd();
PageIterator page_begin = one_pdf.PageBegin();
ElementReader pdfreader = new ElementReader();
PageIterator itr;
int image_counter = 0;  // numbers the output files
for (itr = page_begin; itr != page_end; itr.Next()) {
  pdfreader.Begin(itr.Current());
  Element pdfelement;
  while ((pdfelement = pdfreader.Next()) != null) {
    switch (pdfelement.GetType()) {
      case Element.Type.e_image: {
        string fname = [my_output files path] + image_counter;
        pdftron.PDF.Image image =
          new pdftron.PDF.Image(pdfelement.GetXObject());
        image.ExportAsTiff(fname + ".tif");
        image_counter++;
        break;
      }
    }
  }
  pdfreader.End();
}
one_pdf.Close();
PDFNet.Terminate();
------
A:

The problem is that the PDF format does not store a DPI value for
embedded images, so PDFNet can't initialize this attribute.

There are a couple of ways to work around this problem. You can store
the DPI parameter when creating the image (using a custom key/value
pair). After extracting the TIFF, you can use this information to set
the correct DPI value (e.g. using the standard .NET API).

On the PDF generation side, you can associate a custom attribute with
the image as follows:

pdfimage.GetSDFObj().Put("_MYDPI", Obj.CreateNumber(200));

On the PDF input side, you can read back the custom attribute
associated with the image as follows:

Obj dpi = pdfimage.GetSDFObj().FindObj("_MYDPI");
if (dpi != null) {
  double mydpi = dpi.GetNumber();
  // ... use this value to update the DPI in the extracted TIFF
}

Another approach is to embed the original TIFF file and associate it
with the PDF copy of the image. Instead of exporting the PDF image, you
would extract the embedded TIFF.

On the PDF generation side:

pdftron.Filters.StdFile embed_file =
  new pdftron.Filters.StdFile("my.tif", StdFile.OpenMode.e_read_mode);
pdftron.Filters.FilterReader mystm = new FilterReader(embed_file);

pdfimage.GetSDFObj().Put("_MYTIFF", pdfdoc.CreateIndirectStream(mystm));

On the PDF input side:

Obj embedded_tiff = pdfimage.GetSDFObj().FindObj("_MYTIFF");
if (embedded_tiff != null) {
  Filter stm = embedded_tiff.GetDecodedStream();
  FilterReader reader = new FilterReader(stm);
  reader.Read(...);
  // ... extract embedded TIF ...
}

Q:

Thank you for your help. I understand my problem but I have another
one.
I'm working on your second solution, extracting the embedded TIFF file.

I understand your method, but can you tell me what you mean by " ...
extract embedded TIF ... " ? I've tried several things, such as the
FilterWriter methods, but I am not sure how to use the API.
-------
A:

The pseudocode to extract an embedded stream may look as follows:

Obj embedded_tiff = pdfimage.GetSDFObj().FindObj("_MYTIFF");
if (embedded_tiff != null) {
  // extract embedded TIF ...
  Filter stm = embedded_tiff.GetDecodedStream();
  FilterReader reader = new FilterReader(stm);

  StdFile out_file = new StdFile("my.tif", StdFile.OpenMode.e_write_mode);
  FilterWriter writer = new FilterWriter(out_file);
  writer.WriteFilter(reader);
  writer.Flush();
  out_file.Close();
}
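
Once the embedded stream has been written out, it may be worth checking
that the extracted file really is identical to the TIFF that was
embedded. The following is a minimal sketch using only the standard .NET
library (no PDFNet calls); the class name, method name and file names are
made up for illustration.

```csharp
using System.IO;
using System.Linq;

static class FileCompare
{
    // Returns true when both files have the same length and the same bytes.
    // Both files are read fully into memory, which is fine for a spot check
    // but may be wasteful for very large scans.
    public static bool SameBytes(string pathA, string pathB)
    {
        byte[] a = File.ReadAllBytes(pathA);
        byte[] b = File.ReadAllBytes(pathB);
        return a.Length == b.Length && a.SequenceEqual(b);
    }
}
```

For example, FileCompare.SameBytes("original.tif", "extracted.tif")
should return true if the stream was embedded and extracted unmodified.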

"Is there any way to just extract an embedded image with exactly the
same format, the same size, and the same resolution?"

You can extract the image with the same pixel dimensions (e.g.
2000x2000 pixels) and the same color format. However, you can't get the
DPI, because the PDF format does not store it.

"Why does my extracted image have a resolution of 100 DPI when I
extract it with the ExportAsTiff(file) method?"

This is the default resolution that we assign to all extracted images.
The value is arbitrary; it could just as well be any other value (e.g.
1000 DPI or 0 DPI).

If you have the input images, you can store the custom attribute or
embed the file as described above. If you don't have the original image,
there is no way to determine the original DPI with certainty.

Q:

So the PDF format does not store the DPI of each image.
But is it possible that the PDF file has a single DPI attribute?
Finally, is it possible to change the default resolution of extracted
images?
---

A:

PDF files are by definition 'resolution independent', meaning that
they can be printed at any DPI. As a result, there is no single DPI
attribute associated with the entire document/page.
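
To make this concrete: an image in a PDF has only an effective
resolution, which follows from its pixel count and the size at which it
is drawn on the page. The sketch below is plain arithmetic and makes no
PDFNet calls. It assumes the default PDF user-space unit of 1/72 inch;
the class and method names are made up for illustration.

```csharp
using System;

static class EffectiveDpi
{
    // Default PDF user space: 72 units (points) per inch.
    const double PointsPerInch = 72.0;

    // pixels:       image width (or height) in pixels.
    // sizeInPoints: width (or height) at which the image is drawn on the page.
    public static double Compute(int pixels, double sizeInPoints)
    {
        if (sizeInPoints <= 0)
            throw new ArgumentOutOfRangeException(nameof(sizeInPoints));
        return pixels * PointsPerInch / sizeInPoints;
    }
}
```

For example, EffectiveDpi.Compute(2000, 720.0) gives 200: a
2000-pixel-wide image drawn 720 points (10 inches) wide. Drawing the
same image 360 points wide gives 400, which is why this value is a
property of the placement and not of the image data.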

You can definitely change the default resolution of extracted images.
For example, if you use the GetBitmap() method to obtain the Bitmap, use
the SetResolution() method on the System.Drawing.Bitmap object to set
the desired resolution. Similarly, you can open exported TIFF images
(e.g. using libtiff or the .NET API) and modify the resolution as
required.
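
As a rough sketch of the .NET route, the helper below loads an exported
TIFF with System.Drawing, sets the resolution and saves the result to a
new file. The class, method and parameter names are made up for
illustration. Note the assumptions: System.Drawing re-encodes the image
on save, so compression (and possibly pixel format) may differ from the
exported file, only the first frame of a multi-page TIFF is written, and
on recent .NET versions System.Drawing is supported on Windows only.

```csharp
using System.Drawing;
using System.Drawing.Imaging;

static class TiffResolution
{
    // Loads a TIFF, stamps the given DPI on it, and saves it to a new file.
    // outputPath must differ from inputPath: GDI+ keeps the source file
    // locked for as long as the loaded Bitmap is alive.
    public static void Rewrite(string inputPath, string outputPath, float dpi)
    {
        using (var bitmap = new Bitmap(inputPath))
        {
            bitmap.SetResolution(dpi, dpi);
            bitmap.Save(outputPath, ImageFormat.Tiff);
        }
    }
}
```

A call such as TiffResolution.Rewrite("image0.tif", "image0_200dpi.tif",
200f) could then be made with the DPI value read back from the custom
"_MYDPI" attribute described earlier.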