DeepSpeed
d5fa87ff - Write multiple items to output file at once, in distributed data analyzer. (#5169)

Commit
1 year ago
Write multiple items to output file at once, in distributed data analyzer. (#5169)

Minor improvements of [https://github.com/microsoft/DeepSpeed/pull/5129](https://github.com/microsoft/DeepSpeed/pull/5129).

- Writes all buffers at once to the output file, instead of iteratively (`indexed_dataset.py`, method `add_items()`).
- Fixes the wrong initialisation of `num_workers` and `worker_id` that were being ignored when they were provided by the user.

---------

Co-authored-by: Conglong Li <conglong.li@gmail.com>
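
The two changes described in this commit message can be illustrated with a minimal sketch. This is not the DeepSpeed implementation: `BufferWriter`, `resolve_worker_config`, and the fallback defaults are hypothetical names and values chosen for illustration, and only the method name `add_items()` and the parameter names `num_workers` and `worker_id` come from the commit message. The sketch assumes each item is already a `bytes` buffer.

```python
import io


class BufferWriter:
    """Hypothetical stand-in for an indexed-dataset builder (not DeepSpeed's class)."""

    def __init__(self, stream):
        self._stream = stream
        self.sizes = []  # byte length of each item written so far

    def add_item(self, buffer):
        # Iterative style: one write call per item.
        self._stream.write(buffer)
        self.sizes.append(len(buffer))

    def add_items(self, buffers):
        # Batched style: join all buffers, then issue a single write call.
        buffers = list(buffers)
        self._stream.write(b"".join(buffers))
        self.sizes.extend(len(b) for b in buffers)


def resolve_worker_config(num_workers=None, worker_id=None,
                          default_num_workers=1, default_worker_id=0):
    """Keep user-provided values; fall back to defaults only when None.

    The defaults here are assumptions for illustration. Comparing against
    None (rather than using `or`) matters because worker_id=0 is a valid
    user-provided value.
    """
    if num_workers is None:
        num_workers = default_num_workers
    if worker_id is None:
        worker_id = default_worker_id
    return num_workers, worker_id
```

Both write styles produce the same bytes in the output stream; the batched version only reduces the number of write calls, which is the kind of saving the first bullet describes.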