accelerate
3e27a7c6 - Fix iterable dataset sharding condition when n_shards == num_processes (#3958)

Commit

78 days ago

Fix iterable dataset sharding condition when n_shards == num_processes (#3958)

* Fix iterable dataset sharding condition when n_shards == num_processes

  Use >= instead of > so that native HF dataset sharding is used when the shard count exactly matches the process count, instead of falling back to the less efficient IterableDatasetShard wrapper.

* nice
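The boundary case described in the commit message can be illustrated with a minimal sketch. This is not the actual accelerate source: the function names `old_condition`, `new_condition`, and `pick_strategy` are hypothetical, and only the comparison operators (`>` versus `>=`) and the names `n_shards`, `num_processes`, and `IterableDatasetShard` come from the commit message.

```python
# Hypothetical sketch of the condition change; not accelerate's real code.

def old_condition(n_shards: int, num_processes: int) -> bool:
    # Before the fix: native sharding only when shards strictly outnumber processes.
    return n_shards > num_processes


def new_condition(n_shards: int, num_processes: int) -> bool:
    # After the fix: native sharding also when the counts match exactly.
    return n_shards >= num_processes


def pick_strategy(n_shards: int, num_processes: int, condition) -> str:
    # Returns a label for the path taken; the real code would build a dataset here.
    if condition(n_shards, num_processes):
        return "native"
    return "IterableDatasetShard"


if __name__ == "__main__":
    num_processes = 8
    for n_shards in (4, 8, 16):
        before = pick_strategy(n_shards, num_processes, old_condition)
        after = pick_strategy(n_shards, num_processes, new_condition)
        print(f"n_shards={n_shards}: before={before}, after={after}")
```

Only the `n_shards == num_processes` row changes between the two conditions; with fewer shards than processes the wrapper is still used, and with more shards native sharding was already used.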

References

#3958 - Fix iterable dataset sharding condition when n_shards == num_processes

Author

SunMarc

Parents

74427420
