DeepSpeed
[Zero2] Reduce the unnecessary all-reduce when tensor size is 0.
#5868
Merged
