unstructured
feat: use lxml instead of bs4 to parse hOCR data
#3960
Merged

badGarnet merged 2 commits into main from feat/use-lxml-to-parse-hocr
badGarnet feat: use lxml instead of bs4 to parse hOCR data
1f28d8b9
badGarnet fix: fix test by building etree objects
ae8b0597
badGarnet marked this pull request as ready for review 309 days ago
ryannikolaidis approved these changes on 2025-03-17
badGarnet merged 4e424efd into main 309 days ago
badGarnet deleted the feat/use-lxml-to-parse-hocr branch 309 days ago
