DeepSpeed
4c687bfd - Added device detection to communication logging (#7398)

Commit
87 days ago
Added device detection to communication logging (#7398)

In `comms_logging.py`, when `log_all` is called with the `show_straggler` option enabled, an all_reduce is performed across all nodes to compute the minimum latency and identify stragglers. However, the tensors used in this all_reduce are not moved to the configured device. This commit adds that step using DeepSpeed's abstract accelerator API.

Resolves #7397

Signed-off-by: Alex Kiefer <alexkiefer51@gmail.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
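As a rough illustration of the straggler calculation described in the commit message, the following sketch simulates a MIN all_reduce over per-rank latencies using plain Python lists. It is not the code from `comms_logging.py`: the helper names `all_reduce_min` and `straggler_effect`, the sample latencies, and the choice to sum the per-call differences are all assumptions made for the example. In a real distributed run, each rank would hold its latencies in a tensor that must sit on the configured accelerator device before the collective runs, which is the gap this commit addresses.

```python
# Minimal sketch, assuming each rank records one latency per collective call.
# All names and numbers here are hypothetical, not taken from DeepSpeed.

def all_reduce_min(per_rank_values):
    """Element-wise MIN across ranks, mimicking an all_reduce with a MIN op."""
    return [min(column) for column in zip(*per_rank_values)]


def straggler_effect(per_rank_latencies):
    """Return, per rank, the total latency in excess of the fastest rank."""
    fastest = all_reduce_min(per_rank_latencies)
    return [
        sum(local - best for local, best in zip(latencies, fastest))
        for latencies in per_rank_latencies
    ]


# Simulated latencies in milliseconds for three calls on three ranks.
latencies_ms = [
    [10, 12, 11],  # rank 0
    [10, 20, 11],  # rank 1
    [13, 12, 11],  # rank 2
]

effects = straggler_effect(latencies_ms)
for rank, extra in enumerate(effects):
    print(f"rank {rank}: {extra} ms above the fastest rank")
```

A rank whose total is well above the others spent more time inside the collective than the fastest rank did, which is the signal a straggler report looks for.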