DeepSpeed
368976fc - DeepSpeed Sequence (#535)

Commit
2 years ago
DeepSpeed Sequence (#535)

* DS sequence impl
* add communication groups for sequence parallelism
* add all_to_all to torch comm backend

---------

Co-authored-by: Sam Ade Jacobs <samjacobs@microsoft.com>
Co-authored-by: Masahiro Tanaka <mtanaka@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>