enhancement: add "ocr_only" strategy for PDFs #553
add tests for validating strategy
f652048f
refactor into determine_pdf_strategy function
919ace23
refactor pdf strategies into strategies
30d31d69
remove commented out code
a1d55e35
remove unreachable code
de2d7ae9
add in handling for image types
30f1739f
a little more refactoring
4c1a9ae7
import ocr partitioning for images
e211c29a
catch warnings, partition type for valid strategies
8e39cd4f
fallback to ocr_only from fast
c35e7b85
fallback logic for hi_res
e77dff96
test for fallback to ocr only
e849651e
fallback logic for ocr_only
7488e278
more tests for fallback logic
93af828e
update doc strings
4485e901
version and changelog
471dc77c
linting, linting, linting
34e06726
update docs to include notes about strategy
07b30514
Merge branch 'main' into enhancement/ocr-only-for-pdfs
52c5db1c
MthwRobinson
marked this pull request as ready for review 2 years ago
qued
approved these changes
on 2023-05-08
fix typos
6c2b9910
Merge branch 'enhancement/ocr-only-for-pdfs' of github.com:Unstructur…
6c82c513
change back patched filename
f487924d
bump version for release
eeb5abbb
MthwRobinson
deleted the enhancement/ocr-only-for-pdfs branch 2 years ago