unstructured
0cd07d78 - feat: `partition_pdf()` add ability to get `cid` ratio (#2970)

Commit
1 year ago
This PR adds the ability to get the ratio of `cid` characters in the embedded text that `pdfminer` extracts. It is the second part of moving `cid`-related code from `unstructured-inference` to `unstructured`, and works together with https://github.com/Unstructured-IO/unstructured-inference/pull/342.
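For context, `pdfminer` emits a placeholder of the form `(cid:NNN)` when it cannot map a glyph to a Unicode character, so a high share of such placeholders suggests the embedded text is unreliable. The following is a minimal sketch of how such a ratio could be computed. The function name `cid_ratio` and the choice to count each placeholder as one character are assumptions made for illustration, not necessarily the code this PR adds.

```python
import re

# pdfminer writes "(cid:NNN)" for glyphs it cannot map to Unicode.
CID_PATTERN = re.compile(r"\(cid:\d+\)")


def cid_ratio(text: str) -> float:
    """Return the share of characters in `text` that are cid placeholders.

    Hypothetical helper: each "(cid:NNN)" token counts as one character,
    and the denominator is the number of placeholders plus the number of
    remaining ordinary characters.
    """
    # Strip the placeholders and count how many were removed.
    remaining, n_cid = CID_PATTERN.subn("", text)
    total = n_cid + len(remaining)
    if total == 0:
        return 0.0  # empty input: nothing to measure
    return n_cid / total
```

A caller might compare the result against a threshold of its own choosing and fall back to OCR when the embedded text is mostly placeholders.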