allreduce_always_fp16 (#1487)

Commit

4 years ago

allreduce_always_fp16 (#1487)

* fp16 allreduce
* Undo sparse sum in nan check
* communication_data_type instead of fp32_allreduce and fp16_allreduce
* sparse_allreduce with fp32 or fp16 data type
* Fix communication_data_type checks
* Allow only torch data types for communication_data_type
* Fix Zero assert messages

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
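The commit message describes replacing two flags (fp32_allreduce, fp16_allreduce) with a single communication_data_type setting and restricting which values it accepts. A minimal sketch of that idea, in plain Python, might look like the following. This is not DeepSpeed's code: the function name resolve_communication_dtype, the string names "fp16"/"fp32", the dict-based config, and the "fp32" default are all assumptions made for illustration.

```python
# Hypothetical sketch: not DeepSpeed's actual implementation.
ALLOWED_COMM_DTYPES = ("fp16", "fp32")  # assumed string names for the two dtypes
LEGACY_FLAGS = {"fp16_allreduce": "fp16", "fp32_allreduce": "fp32"}


def resolve_communication_dtype(config, default="fp32"):
    """Pick the dtype name used for allreduce traffic from a config dict."""
    # Legacy boolean flags that are switched on in this config.
    legacy = [dtype for flag, dtype in LEGACY_FLAGS.items() if config.get(flag)]

    if "communication_data_type" in config:
        if legacy:
            raise ValueError(
                "use communication_data_type alone, without the legacy flags"
            )
        dtype = config["communication_data_type"]
    elif len(legacy) > 1:
        raise ValueError("fp16_allreduce and fp32_allreduce are mutually exclusive")
    elif legacy:
        dtype = legacy[0]
    else:
        dtype = default  # assumed default; the commit message does not state one

    # Reject anything outside the allowed set, mirroring the
    # "allow only ... data types" item in the commit message.
    if dtype not in ALLOWED_COMM_DTYPES:
        raise ValueError(f"unsupported communication_data_type: {dtype!r}")
    return dtype
```

Under these assumptions, resolve_communication_dtype({"communication_data_type": "fp16"}) returns "fp16", while an unknown value such as "int8" raises ValueError. A single setting avoids the ambiguous case where both legacy flags are enabled at once.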

References

#1487 - allreduce_always_fp16

Author

Dipet

Parents

52c7889b

DeepSpeed d14baad9 - allreduce_always_fp16 (#1487)
