Added device detection to communication logging (#7398)
In `comms_logging.py`, when `log_all` is called with the `show_straggler`
option enabled, an all_reduce is performed across all ranks to compute
the minimum latency, which is then used to identify stragglers. However,
the tensors passed to this all_reduce are not moved to the configured
device. This commit moves them to that device using DeepSpeed's abstract
accelerator API.
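
For illustration, here is a minimal pure-Python sketch of the straggler
computation described above. It simulates the ranks with plain lists and
does not use DeepSpeed or torch. The names `all_reduce_min`,
`straggler_effect`, and `latencies_ms` are hypothetical and are not taken
from `comms_logging.py`. In the real code the latencies are held in
tensors, and device backends such as NCCL need those tensors on the
rank's accelerator device before the collective runs, which is the gap
this commit addresses.

```python
# Pure-Python simulation of the min-latency reduction; no DeepSpeed or torch.
# All names below are hypothetical and chosen only for this sketch.


def all_reduce_min(per_rank_values):
    """Simulate an all_reduce with a MIN op.

    Every rank ends up with the element-wise minimum over all ranks' lists.
    """
    reduced = [min(column) for column in zip(*per_rank_values)]
    return [list(reduced) for _ in per_rank_values]


def straggler_effect(per_rank_latencies):
    """For each rank, return how far it lagged behind the fastest rank per call."""
    min_latencies = all_reduce_min(per_rank_latencies)
    return [
        [own - fastest for own, fastest in zip(latencies, mins)]
        for latencies, mins in zip(per_rank_latencies, min_latencies)
    ]


# Latencies in ms of three logged calls on three simulated ranks.
latencies_ms = [
    [10, 12, 11],  # rank 0
    [10, 20, 11],  # rank 1 is slow on the second call
    [13, 12, 11],  # rank 2 is slow on the first call
]

effects = straggler_effect(latencies_ms)
for rank, effect in enumerate(effects):
    print(f"rank {rank}: total straggler effect = {sum(effect)} ms")
```

A rank whose total is zero was never slower than the fastest rank on any
logged call. Larger totals point at likely stragglers.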
Resolves #7397
Signed-off-by: Alex Kiefer <alexkiefer51@gmail.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>