unstructured
6360ef7f - fix: isolate Table elements in pre-chunks (#4307)

4 days ago
fix: isolate Table elements in pre-chunks (#4307)

## Summary

This change enforces the documented table-isolation guarantees in chunking:

- `Table` and `TableChunk` are always staged in their own pre-chunk and never combined with adjacent non-table elements into a `CompositeElement`.
- `PreChunkCombiner` will not merge pre-chunks when either side contains a table-family element, preventing “table gets wrapped/merged” behavior when `combine_text_under_n_chars` is enabled.
- Shared helper functions centralize the table-isolation checks in `unstructured.chunking.base`.

Also includes:

- Updated/adjusted chunking tests to reflect the new behavior.
- Added a dedicated `test_table_isolation.py` regression suite.
- Version bump + `CHANGELOG.md` entry to document the fix.

Closes #3921
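The two isolation rules described above can be illustrated with a small, self-contained sketch. This is not the code from the commit or the actual helpers in `unstructured.chunking.base`: `Element`, `TABLE_FAMILY`, `is_table_family`, `contains_table`, `stage_pre_chunks`, and `combine_pre_chunks` are hypothetical names invented for illustration, and the staging logic is deliberately simplified (it only breaks on titles and tables, with no size limits). Only the `combine_text_under_n_chars` parameter name is taken from the commit message.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Element:
    """Hypothetical stand-in for a document element (not the library class)."""

    text: str
    category: str = "NarrativeText"  # e.g. "Title", "Table", "TableChunk"


# Assumed set of "table-family" categories, following the commit message.
TABLE_FAMILY = {"Table", "TableChunk"}


def is_table_family(element: Element) -> bool:
    return element.category in TABLE_FAMILY


def contains_table(pre_chunk: List[Element]) -> bool:
    return any(is_table_family(e) for e in pre_chunk)


def stage_pre_chunks(elements: Iterable[Element]) -> Iterator[List[Element]]:
    """Group elements into pre-chunks; every table gets a pre-chunk of its own."""
    pending: List[Element] = []
    for element in elements:
        if is_table_family(element):
            # Flush accumulated text first, then emit the table alone.
            if pending:
                yield pending
                pending = []
            yield [element]
        elif element.category == "Title" and pending:
            # Simplified section boundary: a title starts a new pre-chunk.
            yield pending
            pending = [element]
        else:
            pending.append(element)
    if pending:
        yield pending


def combine_pre_chunks(
    pre_chunks: Iterable[List[Element]], combine_text_under_n_chars: int
) -> List[List[Element]]:
    """Merge small adjacent pre-chunks, but never when either side has a table."""
    combined: List[List[Element]] = []
    for pre_chunk in pre_chunks:
        previous = combined[-1] if combined else None
        can_merge = (
            previous is not None
            and not contains_table(previous)
            and not contains_table(pre_chunk)
            and sum(len(e.text) for e in previous) < combine_text_under_n_chars
        )
        if can_merge:
            previous.extend(pre_chunk)
        else:
            combined.append(list(pre_chunk))
    return combined


elements = [
    Element("Intro", "Title"),
    Element("Short paragraph."),
    Element("a | b", "Table"),
    Element("Text after the table."),
]
chunks = combine_pre_chunks(stage_pre_chunks(elements), combine_text_under_n_chars=500)
for chunk in chunks:
    print([e.category for e in chunk])
```

In this sketch the table ends up alone in the middle chunk even though the surrounding text is far below the combining threshold, which is the behavior the commit message describes for `combine_text_under_n_chars`.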