unstructured
208c7edc - rfctr(csv): minify HTML and table text is cct (#3733)

Commit
1 year ago
rfctr(csv): minify HTML and table text is cct (#3733)

**Summary**
Eliminate historical "idiosyncrasies" of `table.metadata.text_as_html` HTML introduced by `partition_csv()`. Produce minified `.text_as_html` consistent with that formed by chunking.

**Additional Context**
- CSV `.metadata.text_as_html` is minified (no extra whitespace and no thead, tbody, or tfoot elements).
- `table.text` is the clean-concatenated-text (CCT) of the table.

---------

Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com>
Co-authored-by: scanny <scanny@users.noreply.github.com>
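To make the two properties above concrete, here is a minimal standard-library sketch of what "minified" table HTML and clean-concatenated text could look like for a small CSV. It is not the code from this commit: the helper names `csv_to_minified_html` and `csv_to_cct` are hypothetical, and the exact markup and whitespace rules that `partition_csv()` applies are assumptions here.

```python
import csv
import io
from html import escape


def csv_to_minified_html(csv_text: str) -> str:
    """Hypothetical helper: render CSV as a table with no inter-tag whitespace
    and no thead/tbody/tfoot wrappers (assumed meaning of "minified")."""
    rows = csv.reader(io.StringIO(csv_text))
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(cell.strip())}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>{body}</table>"


def csv_to_cct(csv_text: str) -> str:
    """Hypothetical helper: join the non-empty, stripped cell texts with single
    spaces (assumed meaning of "clean-concatenated-text")."""
    rows = csv.reader(io.StringIO(csv_text))
    cells = (cell.strip() for row in rows for cell in row)
    return " ".join(cell for cell in cells if cell)


sample = "name,qty\nwidget, 3\ngadget,5\n"
html = csv_to_minified_html(sample)
text = csv_to_cct(sample)
print(html)
print(text)
```

In the library itself, the corresponding values would be read from the `Table` element returned by `partition_csv()`, via `.metadata.text_as_html` and `.text`.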