unstructured
fix: Correctly patch pdfminer to avoid unnecessarily and unsuccessfully repairing PDFs with long content streams, causing needless and endless OCR
#3822
Merged

dhdaines
dhdaines fix: correctly patch EOF handling in pdfminer (fixes: #3815)
e0f464a1
dhdaines chore: add missing newline
1637377d
dhdaines docs: clarify exactly what we are patching here
7d87840a
fix: correct the import of PSSyntaxError
39b2472c
docs: document what patch_psparser does
99b1c616
chore: ruff
6cba88ac
dhdaines changed the title from "Fix the fix to pdfminer" to "Correctly patch pdfminer to avoid unnecessarily and unsuccessfully repairing PDFs with long content streams, causing needless and endless OCR" 1 year ago
dhdaines chore: changelog
1b0c7f75
dhdaines changed the title from "Correctly patch pdfminer to avoid unnecessarily and unsuccessfully repairing PDFs with long content streams, causing needless and endless OCR" to "fix: Correctly patch pdfminer to avoid unnecessarily and unsuccessfully repairing PDFs with long content streams, causing needless and endless OCR" 1 year ago
PhorstenkampFuzzy
qued Merge branch 'main' into fix_the_fix_to_pdfminer
b9747206
qued requested a review from qued 1 year ago
qued
qued approved these changes on 2025-01-24
qued Merge branch 'main' into fix_the_fix_to_pdfminer
8264d733
qued Update version
897ffad3
qued
qued merged 9e5ff225 into main 1 year ago
