DeepSpeed
MOE gate fixes and enhancements
#5156
Merged

tjruwase merged 13 commits into deepspeedai:master from mosheisland:moe/gate
mosheisland requested a review from awan-10 1 year ago
tjruwase requested a review from tohtana 1 year ago
tjruwase removed review request from tohtana 1 year ago
mosheisland force pushed from 4cb1faba to 4706fa2a 1 year ago
mosheisland force pushed from 4706fa2a to 8f9d75c4 1 year ago
MOE: Support top2 with disable token dropping
3e0c35f3
MOE: Fix top2 aux loss
692d42df
MOE: Support disable top2 2nd expert sampling
953e6984
MOE: Fix capacity when using TP for non-MoE
64150db0
MOE: Fix gate conversion to fp32
aab9fc3a
mosheisland force pushed from 8f9d75c4 to aab9fc3a 1 year ago
awan-10 commented on 2024-02-21
awan-10 commented on 2024-02-21
awan-10 commented on 2024-02-21
awan-10 assigned awan-10 1 year ago
Revert "MOE: Fix top2 aux loss"
1657955c
awan-10 approved these changes on 2024-02-22
loadams Merge branch 'master' into moe/gate
05b2262e
mosheisland Merge branch 'master' into moe/gate
6c04451a
mosheisland Merge branch 'master' into moe/gate
c8e05eaa
mosheisland Merge branch 'master' into moe/gate
7eb3633e
mosheisland Merge branch 'master' into moe/gate
b76a541c
mosheisland Merge branch 'master' into moe/gate
3d227ef0
mosheisland Merge branch 'master' into moe/gate
30118cf1
tjruwase merged 5a2e705b into master 1 year ago
