DeepSpeed
Fix the sequence-parallelism for the dense model architecture
#4530
Merged
mrwyattii merged 8 commits into master from fix-sp-dense
fix the sequence-parallelism for the dense models (066644d7)
RezaYazdaniAminabadi requested a review from jeffra 2 years ago
RezaYazdaniAminabadi requested a review from tjruwase 2 years ago
RezaYazdaniAminabadi requested a review from samyam 2 years ago
RezaYazdaniAminabadi requested a review from mrwyattii 2 years ago
RezaYazdaniAminabadi changed the title from "Fix the sequence-parallelism for the dense models" to "Fix the sequence-parallelism for the dense model architecture" 2 years ago
fix the gradient scale for when zero is not enabled (8d901bfc)
fix comm group for allreduce (0bb95947)
fix format (aaae9949)
samadejacobs removed review request from samyam 2 years ago
samadejacobs requested a review from samadejacobs 2 years ago
Merge branch 'master' into fix-sp-dense (7ae577cd)
tjruwase requested a review from tohtana 2 years ago
tjruwase commented on 2023-10-21
Allow users to set/override sp comm data type from ds config (01ccf331)
Fix formatting (568ae5a6)
Merge branch 'master' into fix-sp-dense (bff46e51)
mrwyattii merged ec029e76 into master 2 years ago
tjruwase commented on 2023-10-26
Reviewers
tjruwase
jeffra
mrwyattii
samadejacobs
tohtana
Assignees
No one assigned
Labels
None yet
Milestone
No milestone