DeepSpeed
Fix the MoE-params gradient-scaling
#4957
Merged


RezaYazdaniAminabadi
RezaYazdaniAminabadi Fix the MoE-params gradient-scaling
ecd102f1
RezaYazdaniAminabadi requested a review from tjruwase 2 years ago
RezaYazdaniAminabadi requested a review from mrwyattii 2 years ago
RezaYazdaniAminabadi Merge branch 'master' into fix-moe-grad-scaling
35b79795
tjruwase approved these changes on 2024-01-17
RezaYazdaniAminabadi Merge branch 'master' into fix-moe-grad-scaling
a0868e3d
RezaYazdaniAminabadi Merge branch 'master' into fix-moe-grad-scaling
cf2af492
RezaYazdaniAminabadi Merge branch 'master' into fix-moe-grad-scaling
f8f8ef99
tjruwase merged 9d2660d2 into master 2 years ago

