unstructured
d0749d18 - fix: avoid PDF sorting error on negative coords (#1361)

Commit
2 years ago
fix: avoid PDF sorting error on negative coords (#1361)

The default sorting algorithm for PDFs, "xycut", would cause an error when partitioning a document if Y coordinate points were negative. This change checks for that condition (or, more broadly, any negative coordinates) and falls back to the "basic" sort if that is the case.

This PR does not address the underlying issue of "bad points", which should still be investigated. However, the sorting code should be less brittle to unexpected bounding boxes in the first place.

Resolves: https://github.com/Unstructured-IO/unstructured/issues/1296
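The fallback described in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: the `BBox` tuple layout and the helpers `has_negative_coords`, `choose_sort_mode`, and `basic_sort` are hypothetical names introduced here, and the top-to-bottom, left-to-right ordering is only an assumed stand-in for whatever the "basic" sort actually does.

```python
from typing import List, Sequence, Tuple

# Hypothetical bounding-box layout (x1, y1, x2, y2); not the library's own type.
BBox = Tuple[float, float, float, float]


def has_negative_coords(boxes: Sequence[BBox]) -> bool:
    """Return True if any coordinate of any bounding box is negative."""
    return any(coord < 0 for box in boxes for coord in box)


def choose_sort_mode(boxes: Sequence[BBox], requested: str = "xycut") -> str:
    """Fall back to "basic" when "xycut" is requested but coordinates are negative."""
    if requested == "xycut" and has_negative_coords(boxes):
        return "basic"
    return requested


def basic_sort(boxes: Sequence[BBox]) -> List[BBox]:
    """Assumed stand-in for a simple ordering: by top edge, then by left edge."""
    return sorted(boxes, key=lambda b: (b[1], b[0]))


if __name__ == "__main__":
    boxes = [(5.0, 10.0, 6.0, 11.0), (0.0, -2.0, 1.0, 1.0)]
    mode = choose_sort_mode(boxes)  # negative Y present, so "xycut" is not used
    print(mode, basic_sort(boxes))
```

The check runs before any sorting, so a document with unexpected bounding boxes is still partitioned rather than raising an error; the cause of the negative coordinates is left untouched, as the commit message notes.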