unstructured
f6fcba43 - Add a check for complex pdfs (#4268)

Commit
8 days ago
Add a check for complex pdfs (#4268)

This checks whether a PDF file is likely a complex document that is mostly vector graphics, such as mini-holistic-3-v1-Eng_Civil-Structural-Drawing_p001.pdf, by comparing the ratio of vector images to text elements. To limit the overhead added to every file, the check runs only when the file meets a minimum file size.
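The logic described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `is_likely_complex_pdf`, the `PageElementCounts` container, and both threshold values are hypothetical, and the sketch assumes the vector and text element counts have already been extracted from the PDF by some other step.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
# The commit message does not state the actual values.
MIN_FILE_SIZE_BYTES = 1_000_000
VECTOR_TO_TEXT_RATIO_THRESHOLD = 10.0


@dataclass
class PageElementCounts:
    """Hypothetical container for element counts gathered from a PDF."""

    vector_count: int
    text_count: int


def is_likely_complex_pdf(
    file_size_bytes: int,
    counts: PageElementCounts,
    min_file_size: int = MIN_FILE_SIZE_BYTES,
    ratio_threshold: float = VECTOR_TO_TEXT_RATIO_THRESHOLD,
) -> bool:
    """Return True when the file looks like a vector-heavy document."""
    # Cheap gate first: small files skip the ratio check entirely,
    # so the check adds almost no overhead for them.
    if file_size_bytes < min_file_size:
        return False
    # No vector elements at all means the file cannot be vector-heavy.
    if counts.vector_count == 0:
        return False
    # Vector elements but no text: treat as complex and avoid dividing by zero.
    if counts.text_count == 0:
        return True
    return counts.vector_count / counts.text_count >= ratio_threshold
```

Putting the size comparison first means the element counts only matter for files large enough to plausibly be drawings. In a real pipeline the counting itself would presumably also be skipped for small files, since that is where the cost lies.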