unstructured
27cd53bd - fix: fix multiple values for infer_table_structure (#3870)

Commit
1 year ago
fix: fix multiple values for infer_table_structure (#3870) This PR fixes a bug when using `partition` to partition an email with image attachments with hi_res and allow table structure inference -> the partitioning of the image would encounter a value error: `got multiple values for keyword argument 'infer_table_structure'`. This is because pass `kwargs` into partition "other" types of files in this [block](https://github.com/Unstructured-IO/unstructured/blob/50ea6fe7fc324efa09398898dc35d0cd4e78b1cf/unstructured/partition/auto.py#L270-L280) `infer_table_structure` is packaged into `partitioning_kwargs`. Then for email at least when there are attachments that can be partitioned with `hi_res` we pass that dict of `kwargs` right back into `partition` entry -> so when we get [here](https://github.com/Unstructured-IO/unstructured/blob/50ea6fe7fc324efa09398898dc35d0cd4e78b1cf/unstructured/partition/auto.py#L222-L235) we are both specifying explicitly `infer_table_structure` and have it in `kwargs` variable The fix is to detect first if `kwargs` already contains `infer_table_structure` and if yes use that and pop it from `kwargs`. --------- Co-authored-by: Kamil Plucinski <kamil.plucinski@deepsense.ai> Co-authored-by: christinestraub <christinemstraub@gmail.com> Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com> Co-authored-by: christinestraub <christinestraub@users.noreply.github.com>
Author
Parents
Loading