Refactor: support merging `extracted` layout with `inferred` layout #2158
feat: add functionality to merge `inferred` with `extracted` when `fi…
897f97ca
feat: add functionality to merge `inferred` with `extracted` when `fi…
57eba400
Merge branch 'main' into refactor/pdf_text_extraction_for_hi_res
17ec904b
feat: sort extracted layout by deterministic ordering
e8880476
chore: add force `pip install -e .`
c637ed03
chore: update changelog & version
68aacd34
fix: lint
8669d8b6
chore: update `flake8` config to exclude `unstructured-inference` di…
bb6a16a4
feat: reflect added `Source.PDFMINER` constant
f66cda65
chore: update ci
0e8e4664
refactor: import `order_layout` within function
646a29d7
test: fix lint errors
f8f004dd
Merge branch 'main' into refactor/pdf_text_extraction_for_hi_res
8cd75a40
test: fix unit test errors
d2e2e071
refactor: organize files for partitioning pdf/image
4d2d1900
refactor: add a new module `pdfminer_processing`
4abefa92
feat: update `_merge_inferred_with_extracted()` to get image size fro…
5978279d
refactor: `_merge_inferred_with_extracted()`
1a4083af
test: update module import
f0be24c9
Merge branch 'main' into refactor/pdf_text_extraction_for_hi_res
fbfe8def
chore: update version
62b1513d
feat: use elements returned by `inference.PageLayout.get_elements_fro…
dff68e63
fix: lint errors
149444a9
refactor: move code related to `pdfminer` patch from `unstructured-in…
02190409
test: fix unit test errors
a42b7e6a
refactor: move `_merge_inferred_with_extracted()` to pdfminer_process…
383c4965
Merge branch 'main' into refactor/pdf_text_extraction_for_hi_res
d1489509
test: fix lint errors
c204cf54
feat: import modules depend on `unstructured_inference` library only …
2603374b
Merge branch 'main' into refactor/pdf_text_extraction_for_hi_res
62ea5e86
Merge branch 'main' into refactor/pdf_text_extraction_for_hi_res
7de19d57
refactor: use `init_pdfminer()` in `_open_pdfminer_pages_generator()`
e6f6511a
Merge branch 'main' into refactor/pdf_text_extraction_for_hi_res
d8bf20c9
chore: update changelog & version
dd889995
chore: update ci
1327055b
feat: use the `open_pdfminer_pages_generator()` procedure in the `hi_…
fe29e79e
chore: revert all CI yaml changes
4126e873
chore: bump unstructured-inference==0.7.17
d801ed9c
Merge branch 'main' into refactor/pdf_text_extraction_for_hi_res
d2fa91f1
chore: make pip-compile
651221f5
fix: dependency path error when running pip-compile
6bf43d7e
chore: make pip-compile
f0f07ab5
benjats07 approved these changes on 2023-12-01
Merge branch 'main' into refactor/pdf_text_extraction_for_hi_res
4fba0b66
chore: make pip-compile
bcea80fd
chore: update version
f2e5128b
christinestraub deleted the `refactor/pdf_text_extraction_for_hi_res` branch 2 years ago
Assignees: No one assigned