DeepSpeed
caba320a - fuse the all_to_all for the seq-parallel into one and use all_to_all_single

Commit
2 years ago
fuse the all_to_all for the seq-parallel into one and use all_to_all_single
Author
Reza Yazdani