unstructured
6595632a - enhancement: backup text categorization (#1322)

Commit
2 years ago
enhancement: backup text categorization (#1322)

Currently, when `partition_pdf` is run with the `hi_res` strategy, some elements can come back with category `UncategorizedText`. This happens when the detection model fails to detect an element but we can still recover it, either because its text was embedded in the PDF or because we found it using OCR. This commit attempts to categorize these uncategorized elements using our text-based classification function, `element_from_text`.
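The fallback described above can be illustrated with a minimal, self-contained sketch. It does not reproduce the library's code: `Element`, `classify_text`, and `recategorize` are hypothetical stand-ins, and the heuristics inside `classify_text` are invented for illustration only, not the rules that `element_from_text` applies.

```python
from dataclasses import dataclass
from typing import Callable, List

UNCATEGORIZED = "UncategorizedText"


@dataclass
class Element:
    """Hypothetical stand-in for a partitioned document element."""
    text: str
    category: str = UNCATEGORIZED


def classify_text(text: str) -> str:
    """Hypothetical text-based classifier (assumed heuristics, for illustration)."""
    stripped = text.strip()
    if stripped.startswith(("-", "*", "•")):
        return "ListItem"
    # Short text without closing punctuation is treated as a title.
    if stripped and len(stripped.split()) <= 8 and not stripped.endswith((".", "?", "!", ":")):
        return "Title"
    if stripped.endswith((".", "?", "!")):
        return "NarrativeText"
    return UNCATEGORIZED


def recategorize(
    elements: List[Element],
    classifier: Callable[[str], str] = classify_text,
) -> List[Element]:
    """Apply the text classifier only to elements the detection step left uncategorized."""
    result = []
    for el in elements:
        if el.category == UNCATEGORIZED:
            result.append(Element(el.text, classifier(el.text)))
        else:
            # Elements the model already categorized are passed through unchanged.
            result.append(el)
    return result


elements = [
    Element("Introduction"),
    Element("This is a sentence about the document."),
    Element("- first item"),
    Element("Figure 1", category="Image"),
]
recategorized = recategorize(elements)
```

The point of the design is that the text-based pass acts only as a backup: categories assigned by the detection model are never overwritten, and text the classifier cannot place stays `UncategorizedText`.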