unstructured
f0b0e7c9 - fix: filter coordinates kwargs to prevent TypeError in hi_res PDF processing (#4206)

Commit
8 days ago
## Summary

This PR fixes a bug where passing `coordinates=True` to `partition()` causes a TypeError when processing PDFs with the hi_res strategy.

## Problem

When users call `partition()` with `coordinates=True`, the boolean value flows through kwargs and eventually reaches `add_element_metadata()`. However, this function already receives the computed coordinate data as an explicit parameter, so the same keyword is supplied twice. Python then raises:

```
TypeError: add_element_metadata() got multiple values for keyword argument 'coordinates'
```

This is confusing because users reasonably expect `coordinates=True` to enable coordinate output, not realizing that the hi_res strategy already computes and includes coordinates automatically.

## Solution

Filter out `coordinates` and `coordinate_system` from kwargs before passing them to `add_element_metadata()`. This prevents the conflict while preserving the internally computed coordinate data.

The fix is minimal and targeted: just 3 lines of code that filter the problematic kwargs.

## Changes

- `unstructured/partition/pdf.py`: Added filtering for `coordinates` and `coordinate_system` kwargs
- `CHANGELOG.md`: Added entry for this fix

Fixes #4126
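
The conflict and the filtering approach described above can be reproduced in isolation. The following is a minimal sketch, not the actual code in `unstructured/partition/pdf.py`: the `add_element_metadata` defined here is a simplified stand-in with an assumed signature, and `attach_unfiltered`, `attach_filtered`, and `_CONFLICTING_KWARGS` are hypothetical names introduced only for illustration.

```python
# Hypothetical stand-in for the real function; the signature is an assumption.
def add_element_metadata(element, coordinates=None, coordinate_system=None, **kwargs):
    return {
        "element": element,
        "coordinates": coordinates,
        "coordinate_system": coordinate_system,
    }


# Keys that the caller already passes explicitly (illustrative name).
_CONFLICTING_KWARGS = {"coordinates", "coordinate_system"}


def attach_unfiltered(element, computed_coordinates, computed_system, **kwargs):
    # User kwargs are forwarded as-is. If they contain "coordinates", the
    # keyword is supplied twice and Python raises TypeError.
    return add_element_metadata(
        element,
        coordinates=computed_coordinates,
        coordinate_system=computed_system,
        **kwargs,
    )


def attach_filtered(element, computed_coordinates, computed_system, **kwargs):
    # Drop the conflicting keys so only the internally computed values are passed.
    safe_kwargs = {k: v for k, v in kwargs.items() if k not in _CONFLICTING_KWARGS}
    return add_element_metadata(
        element,
        coordinates=computed_coordinates,
        coordinate_system=computed_system,
        **safe_kwargs,
    )


if __name__ == "__main__":
    points = ((0, 0), (1, 1))
    try:
        attach_unfiltered("text", points, "PixelSpace", coordinates=True)
    except TypeError as exc:
        print(f"unfiltered call failed: {exc}")

    result = attach_filtered("text", points, "PixelSpace", coordinates=True)
    print(result["coordinates"])
```

In this sketch the user-supplied `coordinates=True` is silently discarded rather than rejected, which matches the behavior the commit message describes: the computed coordinate data always wins.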