DeepSpeed
Fix moe cpu offload
#5220
Merged

RezaYazdaniAminabadi
sfc-gh-reyazda Skip gradient-norm averaging when cpu-device is selected
feb578af
sfc-gh-reyazda fix formatting
9eb6129e
RezaYazdaniAminabadi requested a review from tjruwase 1 year ago
RezaYazdaniAminabadi requested a review from mrwyattii 1 year ago
RezaYazdaniAminabadi Merge branch 'master' into fix-moe-cpu-offload
9a59df40
tjruwase commented on 2024-03-02
sfc-gh-reyazda average the grad-norms by sending the gradients to GPU when using off…
ae4bc936
sfc-gh-reyazda Merge branch 'fix-moe-cpu-offload' of https://github.com/RezaYazdaniA…
d8255e07
RezaYazdaniAminabadi Merge branch 'master' into fix-moe-cpu-offload
4fc1d8b9
tjruwase approved these changes on 2024-03-04
mrwyattii merged e6e8c137 into master 1 year ago
