unstructured
fix: remove duplicate characters caused by fake bold rendering in PDFs
#4215
Merged

badGarnet merged 25 commits into Unstructured-IO:main from fix/remove-pdf-bold-text-duplication
bittoby fix: remove duplicate characters caused by fake bold rendering in PDFs
f8af84b5
bittoby fix: solve merge conflict
8d80a34f
bittoby fix: apply character deduplication to fast strategy for fake-bold PDFs
92c02d68
bittoby fix: define imports at the top
83773989
bittoby test: simplify fake-bold integration test assertions
d817d42c
badGarnet commented on 2026-02-02
badGarnet commented on 2026-02-02
bittoby Merge branch 'main' of https://github.com/bittoby/unstructured into f…
e0803a38
bittoby fix: improve fake-bold deduplication tests with specific assertions
3d11da7b
bittoby Merge branch 'main' of https://github.com/bittoby/unstructured into f…
90a82c24
bittoby fix: remove unused pytest import to pass ruff linter
355e9255
bittoby force pushed from 29d32e5e to 355e9255 52 days ago
badGarnet approved these changes on 2026-02-05
badGarnet enabled auto-merge 50 days ago
bittoby fix: black formatting violations in PDF test files for CI/CD compliance
14d1231d
disabled auto-merge 50 days ago
Head branch was pushed to by a user without write access
badGarnet requested changes on 2026-02-05
bittoby fix: Update code formatting and element ID to match new deterministri…
80e27740
bittoby requested a review from badGarnet 50 days ago
badGarnet commented on 2026-02-06
badGarnet commented on 2026-02-06
bittoby fix: Update CHANGELOG
68fc61c8
bittoby fix: recover origin 0.18.35
0728ec02
bittoby requested a review from badGarnet 49 days ago
badGarnet approved these changes on 2026-02-06
bittoby fix: improve PDF fake-bold deduplication by adding bounding box overl…
fb1067d4
bittoby fix: solve merge conflict
1894ed11
badGarnet commented on 2026-02-10
bittoby fix: make pdf character overlap ratio threshold configurable via PDF_…
5f7c2e65
bittoby fix: resolve merge conflict
91003473
badGarnet enabled auto-merge 44 days ago
bittoby fix: resolve Lint style error
66beb1e4
disabled auto-merge 44 days ago
Head branch was pushed to by a user without write access
bittoby fix:resolve conflict
d5bc389a
bittoby fix: Update test lint style
a0a7c7b7
bittoby fix: Update version and changelog
90584aad
badGarnet enabled auto-merge 43 days ago
bittoby fix: solve test error
a84a5f74
disabled auto-merge 43 days ago
Head branch was pushed to by a user without write access
bittoby fix: solve merge conflict
55957bae
bittoby fix: update changlog
99880cfc
badGarnet enabled auto-merge 42 days ago
bittoby fix: update log and version
f97dca57
disabled auto-merge 42 days ago
Head branch was pushed to by a user without write access
badGarnet enabled auto-merge 42 days ago
badGarnet merged 8096b5af into main 42 days ago