unstructured
f5ebb209 - rfctr(html): drop page concept (#3184)

Commit
1 year ago
rfctr(html): drop page concept (#3184)

**Summary**
Pagination of HTML documents is currently unused. The `Page` class and concept were deeply embedded in the legacy organization of the HTML partitioning code because of the legacy `Document` (= pages of elements) domain model. Remove this concept from the code so that elements are available directly from the partitioner.

**Additional Context**
- Pagination can be re-added later if we decide we want it again. A re-implementation would be much simpler, have much lower impact on the structure of the code, and introduce much less additional complexity, similar to the approach we take in `partition_docx()`.
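As a rough illustration of the structural change described above (not the actual code touched by this commit), the sketch below contrasts a page-based document model with a partitioner that returns elements directly. All names here (`Element`, `Page`, `LegacyDocument`, `partition_flat`) are hypothetical stand-ins chosen for the example, not the library's real classes or signatures.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for a document element."""
    text: str


@dataclass
class Page:
    """Hypothetical legacy container: a page holds elements."""
    number: int
    elements: list[Element] = field(default_factory=list)


@dataclass
class LegacyDocument:
    """Hypothetical legacy model: a document is a list of pages."""
    pages: list[Page] = field(default_factory=list)

    @property
    def elements(self) -> list[Element]:
        # Callers had to flatten pages to reach the elements.
        return [el for page in self.pages for el in page.elements]


def partition_flat(blocks: list[str]) -> list[Element]:
    """Hypothetical page-free partitioner: returns elements directly."""
    return [Element(text=b) for b in blocks if b.strip()]


blocks = ["Title", "First paragraph.", "  ", "Second paragraph."]

# Legacy shape: everything ends up on a single page that adds no information.
legacy = LegacyDocument(
    pages=[Page(number=1, elements=[Element(b) for b in blocks if b.strip()])]
)

# Page-free shape: the same elements, with no intermediate layer.
flat = partition_flat(blocks)
```

When pagination is unused, the intermediate `Page` layer only adds a flattening step, so both shapes yield the same elements.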