DeepSpeed
stage_1_and_2.py: no allreduce needed when mp size is 1
#2494
Merged
