unstructured
Switch from pdfminer to paves to improve robustness and use multiple CPUs
#4067
Open

dhdaines feat: switch from pdfminer to paves
a1b94ccd
dhdaines fix: manually hack deps since who knows how they get generated
ac2b2e78
dhdaines chore: black and ruff
6cd328da
dhdaines fix(tests): repair no longer necessary
a5f00e53
fix: avoid importing pypdf just to count pages!
8ec45e02
fix: playa needs "" as default password not None
a489d295
fix: require playa-pdf 0.6.2 for colormap issue
318a954a
fix: isort
2f87d893
fix(tests): playa/paves do not output (cid:N) droppings
e79845f1
fix(tests): update indices since (cid:N) no longer occurs
afb12882
fix(tests): update markdown and html fixtures
ea36f107
fix(tests): fix missing or not missing newline for silly diff
e9997348

Reviewers
No reviews
Assignees
No one assigned