Dealing with parent resources when extracting content using ElementReader


I encountered an issue when extracting information from a PDF with Type 3 fonts using ElementReader.

Obj type3GlyphStream = font.GetType3GlyphStream(charData.char_code);
if (type3GlyphStream != null) {
    using (var reader = new ElementReader()) {
        // Only the glyph stream is passed here; no resource dictionary.
        reader.Begin(type3GlyphStream);
        Element el;
        while ((el = reader.Next()) != null) {
            // fails while iterating
        }
    }
}

The full test code (C# and C++ versions) is attached.


The problem is that your Type 3 glyph content stream references resources (e.g. the R13 ExtGState) that are stored in the parent page's resource dictionary, not in the Type 3 font's own resource dictionary. To fix this, pass the page resource dictionary as the second parameter to ElementReader.Begin(). For example:

reader.Begin(type3GlyphStream, parent_page.GetResourceDict());
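
Putting the fix in context, here is a minimal sketch of the glyph loop with the parent page resources supplied. The class name Type3GlyphHelper, the method CountGlyphElements, and its parameter names are hypothetical and only for illustration; the sketch assumes the PDFNet C# namespaces pdftron.PDF and pdftron.SDF, and that you already have the glyph stream and the page on which the glyph is drawn.

```csharp
using pdftron.PDF;
using pdftron.SDF;

static class Type3GlyphHelper
{
    // Hypothetical helper: walks a Type 3 glyph content stream and
    // returns the number of elements read from it.
    public static int CountGlyphElements(Obj type3GlyphStream, Page parentPage)
    {
        if (type3GlyphStream == null)
        {
            return 0; // nothing to read for this character code
        }

        int count = 0;
        using (var reader = new ElementReader())
        {
            // Supply the page resource dictionary as the second argument so
            // that names used by the glyph stream (e.g. R13) can be resolved.
            reader.Begin(type3GlyphStream, parentPage.GetResourceDict());

            Element el;
            while ((el = reader.Next()) != null)
            {
                count++; // replace with your own per-element processing
            }

            reader.End();
        }
        return count;
    }
}
```

You would call it with the stream returned by font.GetType3GlyphStream(charData.char_code) and the page currently being processed.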