feat: `partition_pdf()` support language specification for PaddleOCR #3400
refactor: pass through ocr_languages
5b53799c
feat: convert TesseractOCR language code to PaddleOCR language code
2c31ec8c
feat: add support for specifying OCR languages when instantiating an …
29b4cbe3
refactor: remove `ocr_languages` param
e0e1227d
refactor: update `image_or_pdf_to_dataframe` to handle changes in `ge…
3d50b31e
Merge branch 'refs/heads/main' into feat/pdf-support-paddleocr-langua…
725e6253
test: update unit test
a54166f0
chore: update changelog & version
66b56a93
feat: handle invalid language code when converting Tesseract language…
73e9346a
scanny
approved these changes
on 2024-07-16
test: add unit test for `tesseract_to_paddle_language()`
106518be
Merge branch 'refs/heads/main' into feat/pdf-support-paddleocr-langua…
2c9d9ea6
chore: update log
df6ace61
feat: remove constant `DEFAULT_PADDLE_LANG`
dd66e749
refactor: update default language setting
02dcec3b
christinestraub
deleted the feat/pdf-support-paddleocr-language-specification branch 1 year ago
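
For readers unfamiliar with the conversion the commits above refer to, here is a minimal sketch of mapping a Tesseract language code to a PaddleOCR one. It is not the code merged in this PR: the function name `sketch_tesseract_to_paddle`, the partial mapping table, the choice to keep only the first language of a `+`-joined Tesseract string, and the fallback to `"en"` with a warning for unknown codes are all assumptions made for illustration.

```python
import logging

logger = logging.getLogger(__name__)

# Partial, illustrative mapping (assumption): Tesseract code -> PaddleOCR code.
TESSERACT_TO_PADDLE = {
    "eng": "en",
    "chi_sim": "ch",
    "chi_tra": "chinese_cht",
    "deu": "german",
    "fra": "french",
    "jpn": "japan",
    "kor": "korean",
}


def sketch_tesseract_to_paddle(tesseract_lang: str, default: str = "en") -> str:
    """Hypothetical helper: convert a Tesseract language string to a PaddleOCR code.

    Tesseract accepts several languages joined with "+", e.g. "deu+eng".
    This sketch assumes the OCR agent takes a single language, so only the
    first entry is converted.
    """
    first = tesseract_lang.split("+")[0].strip().lower()
    paddle_lang = TESSERACT_TO_PADDLE.get(first)
    if paddle_lang is None:
        # Assumed fallback behavior: warn and use the default language.
        logger.warning(
            "Unsupported language code %r; falling back to %r", tesseract_lang, default
        )
        return default
    return paddle_lang
```

Falling back to a default keeps OCR running when a caller passes a code the table does not know. Raising an exception instead would be an equally reasonable design if silent mis-recognition is a concern.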