DeepSpeed
Make bf16_optimizer work for non pipeline parallelism
#2470
Merged
