DeepSpeed
NCCL based 1-bit Implementation + Refactor to add communication backends
#593
Merged
