unstructured
Switch from pdfminer to paves to improve robustness and use multiple CPUs
#4067
Open
dhdaines wants to merge 12 commits into Unstructured-IO:main from dhdaines:switch_from_pdfminer_to_paves
a1b94ccd feat: switch from pdfminer to paves
ac2b2e78 fix: manually hack deps since who knows how they get generated
6cd328da chore: black and ruff
a5f00e53 fix(tests): repair no longer necessary
8ec45e02 fix: avoid importing pypdf just to count pages!
a489d295 fix: playa needs "" as default password not None
318a954a fix: require playa-pdf 0.6.2 for colormap issue
2f87d893 fix: isort
e79845f1 fix(tests): playa/paves do not output (cid:N) droppings
afb12882 fix(tests): update indices since (cid:N) no longer occurs
ea36f107 fix(tests): update markdown and html fixtures
e9997348 fix(tests): fix missing or not missing newline for silly diff
Reviewers: No reviews
Assignees: No one assigned
Labels: None yet
Milestone: No milestone