allreduce_always_fp16 (#1487)
* fp16 allreduce
* Undo sparse sum in NaN check
* communication_data_type instead of fp32_allreduce and fp16_allreduce
* sparse_allreduce with fp32 or fp16 data type
* Fix communication_data_type checks
* Allow only torch data types for communication_data_type
* Fix ZeRO assert messages
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>