DeepSpeed
Fix the sequence-parallelism for the dense model architecture
#4530
Merged


mrwyattii merged 8 commits into master from fix-sp-dense
RezaYazdaniAminabadi
fix the sequence-parallelism for the dense models
066644d7
RezaYazdaniAminabadi requested a review from jeffra 2 years ago
RezaYazdaniAminabadi requested a review from tjruwase 2 years ago
RezaYazdaniAminabadi requested a review from samyam 2 years ago
RezaYazdaniAminabadi requested a review from mrwyattii 2 years ago
RezaYazdaniAminabadi changed the title from "Fix the sequence-parallelism for the dense models" to "Fix the sequence-parallelism for the dense model architecture" 2 years ago
fix the gradient scale for when zero is not enabled
8d901bfc
fix comm group for allreduce
0bb95947
fix format
aaae9949
samadejacobs removed review request from samyam 2 years ago
samadejacobs requested a review from samadejacobs 2 years ago
tjruwase Merge branch 'master' into fix-sp-dense
7ae577cd
tjruwase requested a review from tohtana 2 years ago
tjruwase commented on 2023-10-21
samadejacobs Allow users to set/override sp comm data type from ds config
01ccf331
samadejacobs Fix formatting
568ae5a6
mrwyattii Merge branch 'master' into fix-sp-dense
bff46e51
mrwyattii merged ec029e76 into master 2 years ago
tjruwase commented on 2023-10-26
