I am attempting to accurately scrape all horizontal and vertical lines from a PDF document. This is part of a larger endeavour, but it is a fundamental step needed to accomplish the larger goal. I can get position information for horizontal lines fine. The height and width of vertical lines are also correct, but the vertical position of vertical lines is consistently off. Is there some transform I need to take into account?
To demonstrate, I boiled this down to a very simple application which attempts to draw the lines of page 23 of the source document on the screen, making the issue visually obvious. The code for the main form is below. For my purposes I convert all rectangles to a coordinate system with the origin at the upper left corner of the page, with coordinates increasing downward and to the right, and scaled into inches.
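The conversion described above can be sketched in isolation as follows. This is only an illustration of the intended mapping, not PDFTron code: the module and function names (CoordinateSketch, ToTopLeftInches) are hypothetical, and it assumes the box is given in points (72 per inch) in an unrotated page space whose origin is the lower left corner of the page.

```vb
Imports System.Drawing

Module CoordinateSketch
    ' Hypothetical helper: maps a box given in PDF points (origin at the
    ' lower left, y increasing upward) to inches with the origin at the
    ' upper left and y increasing downward.
    Function ToTopLeftInches(x1 As Double, y1 As Double,
                             x2 As Double, y2 As Double,
                             pageHeightPts As Double) As RectangleF
        Const PointsPerInch As Double = 72.0
        ' The left edge is the smaller x value.
        Dim left As Double = Math.Min(x1, x2) / PointsPerInch
        ' The top edge in the flipped system comes from the larger y value.
        Dim top As Double = (pageHeightPts - Math.Max(y1, y2)) / PointsPerInch
        Dim width As Double = Math.Abs(x2 - x1) / PointsPerInch
        Dim height As Double = Math.Abs(y2 - y1) / PointsPerInch
        Return New RectangleF(CSng(left), CSng(top), CSng(width), CSng(height))
    End Function
End Module
```

In this sketch the top of the flipped rectangle is taken from the larger of the two y values, so a tall, thin box and a short, wide box are positioned by the same rule.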
The source document can be downloaded from here:
http://www.idealliance.org/downloads/md142fsv14213
Any pointers you can provide would be greatly appreciated.
Imports pdftron
Imports pdftron.Common
Imports pdftron.PDF

Public Class Form1
    Private PageBoundaries As RectangleF

    Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
        Dim gr As Graphics = pb1.CreateGraphics ' pb1 is a picturebox on Form1 width=1109, height=908
        gr.ScaleTransform(100, 100)
        Dim PathToDoc As String = "C:\YourFolderName\MD_14_2_FS_v14.2.1.3.pdf"
        Dim pdfdoc As PDFDoc
        Dim page As Page
        Dim reader As New ElementReader
        Dim element As Element
        pdfdoc = New PDFDoc(PathToDoc)
        pdfdoc.InitSecurityHandler()
        page = pdfdoc.GetPage(23)
        PageBoundaries = PageInches(page.GetCropBox)
        reader.Begin(page)
        Debug.WriteLine("=======================================================")
        element = reader.Next
        While element IsNot Nothing
            If element.GetType = PDF.Element.Type.e_path Then
                If element.IsFilled Then
                    Dim elementRct As New Rect
                    element.GetBBox(elementRct)
                    Dim elemInches As RectangleF = PageInches(elementRct)
                    Debug.WriteLine("Height=" & elemInches.Height.ToString & " Width=" & elemInches.Width.ToString & " Top=" & elemInches.Top.ToString & " Bot=" & elemInches.Bottom.ToString & " Lft=" & elemInches.Left.ToString & " Rgt=" & elemInches.Right.ToString)
                    gr.FillRectangle(Brushes.Black, elemInches.X, elemInches.Y, elemInches.Width, elemInches.Height)
                End If
            End If
            element = reader.Next
        End While
        reader.End()
        reader.Dispose()
        reader = Nothing
        page = Nothing
        pdfdoc.Close()
        pdfdoc.Dispose()
        pdfdoc = Nothing
        gr.Dispose()
        gr = Nothing
    End Sub

    Public Function PageInches(ByVal srcRect As Rect) As RectangleF
        Dim PageRect As New RectangleF(CSng(srcRect.x1 / 72.0), CSng(PageBoundaries.Bottom - (srcRect.y1 / 72.0)), CSng(srcRect.Width / 72.0), CSng(srcRect.Height / 72.0))
        Return PageRect
    End Function
End Class
--
Lee Gillie, CCP
Online Data Processing, Inc.