DeepSpeed
Enabled high-performance Automatic Tensor Parallelism (auto TP) for the MoE models on multiple GPUs/HPUs
#6964
Open

Loading