unstructured
78dfb309 - feat: tablechunks can reconstruct table (#4291)

Commit
27 days ago
feat: tablechunks can reconstruct table (#4291)

> [!NOTE]
> **Medium Risk**
> Changes core table-chunking behavior by adding new metadata fields and reconstruction logic; risk is mainly around backward compatibility and correct ordering/HTML merging of split tables.
>
> **Overview**
> Adds end-to-end support for reassembling split tables after chunking. `TableChunk` now receives stable sequencing metadata (`table_id`, `chunk_index`) when a `Table` is split, and a new `reconstruct_table_from_chunks()` helper in `unstructured.chunking.dispatch` groups and merges `TableChunk`s back into full `Table` elements (including merged `text_as_html` when available).
>
> Updates `ElementMetadata` to carry the new fields (dropped during consolidation), bumps version to `0.22.4`, and adds unit tests covering reconstruction across mixed element streams and edge cases like missing `chunk_index`.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 1e732a3e366d6f3f93c898ad2e5e9944855bf0fa. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: cragwolfe <crag@unstructured.io>
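The grouping and merging described in the overview can be illustrated with a minimal, self-contained sketch. It does not call `reconstruct_table_from_chunks()` and makes no claim about that helper's signature or internals. `ChunkStub`, `_inner_rows`, and `merge_chunks` are hypothetical names used only for illustration; only the field names `table_id`, `chunk_index`, and `text_as_html` come from the commit message. The fallback to arrival order when a `chunk_index` is missing, and the assumption that each HTML fragment is wrapped in a bare `<table>` tag, are choices made for this sketch rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ChunkStub:
    """Hypothetical stand-in for a TableChunk and the metadata it carries."""
    text: str
    table_id: Optional[str] = None
    chunk_index: Optional[int] = None
    text_as_html: Optional[str] = None


def _inner_rows(html: str) -> str:
    """Strip one outer <table>...</table> wrapper (assumes a bare tag, no attributes)."""
    html = html.strip()
    if html.startswith("<table>") and html.endswith("</table>"):
        return html[len("<table>"):-len("</table>")]
    return html


def merge_chunks(chunks: List[ChunkStub]) -> Dict[str, ChunkStub]:
    """Group chunks by table_id, order them by chunk_index, and merge each group."""
    groups: Dict[str, List[ChunkStub]] = {}
    for chunk in chunks:
        if chunk.table_id is None:
            continue  # not part of a split table; ignored in this sketch
        groups.setdefault(chunk.table_id, []).append(chunk)

    merged: Dict[str, ChunkStub] = {}
    for table_id, members in groups.items():
        # Assumption: if any chunk lacks chunk_index, keep arrival order instead.
        if all(c.chunk_index is not None for c in members):
            members = sorted(members, key=lambda c: c.chunk_index)
        text = " ".join(c.text for c in members)
        html = None
        # Merge HTML only when every chunk in the group has it.
        if all(c.text_as_html for c in members):
            rows = "".join(_inner_rows(c.text_as_html) for c in members)
            html = f"<table>{rows}</table>"
        merged[table_id] = ChunkStub(text=text, table_id=table_id, text_as_html=html)
    return merged


chunks = [
    ChunkStub("c d", "t1", 1, "<table><tr><td>c</td><td>d</td></tr></table>"),
    ChunkStub("a b", "t1", 0, "<table><tr><td>a</td><td>b</td></tr></table>"),
    ChunkStub("x", "t2", None, None),
]
merged = merge_chunks(chunks)
print(merged["t1"].text)
print(merged["t1"].text_as_html)
```

Here the two `t1` chunks arrive out of order and are re-sorted by `chunk_index` before their rows are concatenated under a single `<table>` wrapper, while `t2` has no HTML and so yields a merged element with `text_as_html` left unset.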