What exactly is a "flow" in TextExtractor?


What exactly is a “flow”? The description of TextExtractor::Line::GetFlowID reads:

The unique identifier for a paragraph or column that this line belongs to.

This is confusing, since columns are usually made up of paragraphs.

Do you have a more precise definition of “flow”? For instance, does a flow
always have rectangular shape?


A flow is a logical construct used to represent reading order. A flow contains a sequence of blocks (paragraphs / columns) representing the reading order on a given page.

Most pages contain only a single flow, but a page can contain two or more, e.g. if half of the page is portrait and the content on the other half is rotated 90 degrees.
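
To make the flow / block relationship concrete, here is a minimal sketch that groups lines first by flow and then by paragraph. It does not call the library: LineInfo is a hypothetical stand-in for the values you would read from each extracted line (the flow ID from GetFlowID, plus a paragraph ID, which is assumed here to be available per line), and the sample data is made up to mimic the rotated-content case described above.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class LineInfo:
    """Hypothetical stand-in for the values read from one extracted line."""
    flow_id: int       # the value GetFlowID would return for the line
    paragraph_id: int  # assumed per-line paragraph identifier
    text: str


def group_by_flow(lines):
    """Group lines into flows, then paragraphs, preserving input order."""
    flows = OrderedDict()
    for line in lines:
        # Paragraphs are keyed under their flow, so the grouping does not
        # depend on paragraph IDs being unique across flows.
        paragraphs = flows.setdefault(line.flow_id, OrderedDict())
        paragraphs.setdefault(line.paragraph_id, []).append(line.text)
    return flows


# Made-up sample: a portrait body (flow 0) plus rotated margin text (flow 1).
sample = [
    LineInfo(0, 0, "Heading"),
    LineInfo(0, 1, "First body line"),
    LineInfo(0, 1, "Second body line"),
    LineInfo(1, 2, "Rotated margin note"),
]

flows = group_by_flow(sample)
for flow_id, paragraphs in flows.items():
    print(f"flow {flow_id}")
    for paragraph_id, texts in paragraphs.items():
        print(f"  paragraph {paragraph_id}: {' / '.join(texts)}")
```

In real code you would fill the list by walking the extractor's lines in order and reading the IDs from each one; the grouping step itself would stay the same.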