DeepSpeed
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2
#2999
Merged
