DeepSpeed
Fix expert grad scaling problem with ZeRO optimizer
#6546
Merged

wyooyw
Fix Expert Grad Scale Problem With Zero Optimizer
607d8c9d
wyooyw requested a review from tjruwase 1 year ago
wyooyw requested a review from loadams 1 year ago
wyooyw
wyooyw changed the title from "Fix Expert Grad Scaling Problem With Zero Optimizer" to "Fix expert grad scaling problem with ZeRO optimizer" 1 year ago
tjruwase removed review request from loadams 1 year ago
tjruwase requested a review from tohtana 1 year ago
tohtana
tohtana commented on 2024-09-17
ranzhejiang
remove useless code
5a44f8c0
wyooyw
ranzhejiang
ranzhejiang commented on 2024-09-18
remove useless comments
b1231c48
wyooyw force pushed from 6e1e90c1 to b1231c48 1 year ago
tohtana Merge branch 'master' into fix_expert_weight_grad_with_zero
14d002df
loadams Merge branch 'master' into fix_expert_weight_grad_with_zero
76dda2a4
tohtana Merge branch 'master' into fix_expert_weight_grad_with_zero
d0de160c
tohtana Merge branch 'master' into fix_expert_weight_grad_with_zero
28b2aff4
tohtana enabled auto-merge 1 year ago
tohtana
tohtana approved these changes on 2024-10-14
tohtana merged b647fb24 into master 1 year ago
