huggingface_hub
787603ea - feat: pass skip_sha256=True to hf_xet for bucket uploads (#3900)

Commit
34 days ago
feat: pass skip_sha256=True to hf_xet for bucket uploads (#3900)

* feat: pass skip_sha256=True to hf_xet for bucket uploads

Bucket uploads don't need SHA-256 in the shard metadata (the sha_index GSI is only used for LFS pointer resolution, which doesn't apply to buckets).

Pass skip_sha256=True to hf_xet.upload_files() and upload_bytes() in the bucket upload path to skip the SHA-256 computation, removing the main CPU bottleneck on non-SHA-NI instances.

Depends on: huggingface/xet-core#679

Co-authored-by: Lucain <Wauplin@users.noreply.github.com>

* test: use real bucket upload instead of mocks for skip_sha256 test

Replace the two mock-based tests with a single integration test that:
- Creates a real Bucket on staging Hub
- Uploads files from both filepath and bytes in a single batch
- Wraps (not mocks) hf_xet.upload_files and hf_xet.upload_bytes to verify skip_sha256=True is passed
- Verifies files are actually uploaded by listing the bucket tree

Co-authored-by: Lucain <Wauplin@users.noreply.github.com>

* test: skip skip_sha256 test when hf_xet doesn't support it yet

The test wraps the real hf_xet functions, so it fails when the installed hf_xet predates the skip_sha256 parameter (xet-core#679). Use inspect.signature to detect support and pytest.skip accordingly.

Co-authored-by: Lucain <Wauplin@users.noreply.github.com>

* test: handle built-in functions in skip_sha256 signature check

hf_xet.upload_files is a compiled built-in function, so inspect.signature() raises ValueError. Catch it and skip the test when the signature can't be introspected (older hf_xet).

Co-authored-by: Lucain <Wauplin@users.noreply.github.com>

* fix: gracefully fall back when hf_xet lacks skip_sha256 support

Use try/except TypeError around upload_files/upload_bytes calls with skip_sha256=True, falling back to calls without it for older hf_xet versions. TypeError for unknown kwargs on compiled functions is raised before any I/O, so the fallback is safe.

Update test to check call_args_list[0] (the first attempt always includes skip_sha256=True) instead of requiring the function to accept it.

Co-authored-by: Lucain <Wauplin@users.noreply.github.com>

* better like this

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Lucain <Wauplin@users.noreply.github.com>
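
The fallback described in the "fix:" entry can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `call_with_skip_sha256`, `old_upload`, and `new_upload` are hypothetical names, and the two stand-in functions only mimic an uploader that does or does not accept the `skip_sha256` keyword. The real `hf_xet.upload_files` and `hf_xet.upload_bytes` signatures are not reproduced here.

```python
from typing import Any, Callable


def call_with_skip_sha256(upload: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Try the call with skip_sha256=True; retry without it on TypeError.

    Assumption carried over from the commit message: an unknown keyword
    argument is rejected with TypeError before the uploader does any I/O,
    so retrying does not repeat work. Note that a TypeError raised for any
    other reason would also trigger the retry.
    """
    try:
        return upload(*args, skip_sha256=True, **kwargs)
    except TypeError:
        # Older uploader without the parameter: fall back to the plain call.
        return upload(*args, **kwargs)


# Hypothetical stand-ins for an old and a new uploader (not the hf_xet API).
def old_upload(paths: list) -> tuple:
    return ("old", paths)


def new_upload(paths: list, skip_sha256: bool = False) -> tuple:
    return ("new", paths, skip_sha256)


if __name__ == "__main__":
    print(call_with_skip_sha256(new_upload, ["a.bin"]))
    print(call_with_skip_sha256(old_upload, ["a.bin"]))
```

Because the first attempt always carries `skip_sha256=True`, a test that wraps the uploader can inspect the first recorded call regardless of which hf_xet version is installed, which matches the `call_args_list[0]` check mentioned in the commit message.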