unstructured
rfctr(html): drop HTML-specific elements
#3207
Merged

scanny merged 11 commits into main from scanny/drop-html-elements
scanny requested a review from Coniferish 2 years ago
scanny force pushed from d48b6b7f to 7e979a83 2 years ago
Coniferish commented on 2024-06-14
scanny force pushed from 7e979a83 to 2edc0f54 2 years ago
scanny force pushed from 2edc0f54 to a996b96f 2 years ago
Coniferish approved these changes on 2024-06-14
scanny force pushed from a996b96f to 4f8bd9c6 2 years ago
scanny force pushed from 4f8bd9c6 to cbc0c4df 2 years ago
scanny rfctr(html): adopt main tag sub-partitioners
273680de
scanny rfctr(html): extract emph meta role from TagsMixin
166392b2
scanny rfctr(html): drop unused TagsMixin.tag attribute
6eed2901
scanny rfctr(html): drop text_as_html role from TagsMixin
fa82cb88
scanny rfctr(html): drop link meta role from TagsMixin
688705f6
scanny rfctr(html): drop HTML-specific element types
a1cfa5bf
scanny rfctr(html): extract ._classify_text()
43869d43
scanny rfctr(html): inline tiny classifiers
0d80f4e6
scanny rfctr(html): adopt text-classifiers
e6f2fd0e
scanny chore: bump CHANGELOG + __version__
b58ebf98
scanny fix: small CI fixes
3824fe4a
scanny force pushed from cbc0c4df to 3824fe4a 2 years ago
scanny merged 9fae0111 into main 2 years ago
scanny deleted the scanny/drop-html-elements branch 2 years ago
