Add afmoe model #42168

alyosha-swamy
alyosha-swamy Add AFMoE model support
1ae79d20
alyosha-swamy force pushed from 1b7106e2 to 7c883b7c 73 days ago
alyosha-swamy force pushed from 7c883b7c to da228818 73 days ago
alyosha-swamy force pushed from da228818 to 6b08d17c 72 days ago
alyosha-swamy force pushed from 6b08d17c to e3ad5e94 72 days ago
alyosha-swamy Merge remote-tracking branch 'upstream/main' into add_afmoe_model
3a4280cd
alyosha-swamy force pushed from e3ad5e94 to 3a4280cd 72 days ago
ArthurZucker
ArthurZucker commented on 2025-11-14
alyosha-swamy Address review feedback for AFMoE implementation
1314162f
alyosha-swamy Add flex attention support to AFMoE model
89586847
alyosha-swamy force pushed from bcd7b97b to 89586847 67 days ago
ArthurZucker Merge branch 'main' into add_afmoe_model
826cb122
alyosha-swamy Fix expert_bias routing in AFMoE
ecb74381
alyosha-swamy force pushed from e4aa76e8 to ecb74381 64 days ago
alyosha-swamy Remove test-results directory
045776de
alyosha-swamy force pushed from 2c209cad to 965ee80e 64 days ago
alyosha-swamy force pushed from 8c6bdb4f to 045776de 64 days ago
ArthurZucker Merge branch 'main' into add_afmoe_model
46ca8d5a
ArthurZucker
ArthurZucker commented on 2025-11-28
alyosha-swamy force pushed from 8d78a296 to 02640f43 56 days ago
ArthurZucker
ArthurZucker approved these changes on 2025-11-28
alyosha-swamy Address PR review feedback for AFMoE model
30c3a206
alyosha-swamy force pushed from 02640f43 to 30c3a206 56 days ago
alyosha-swamy fix(afmoe): ensure RMSNorm output dtype matches input dtype)
79d9c9a5
ArthurZucker
ArthurZucker properly return attn weights
92ee54cb
HuggingFaceDocBuilderDev
ArthurZucker fix most tests
8a4049c6
ArthurZucker cleanup
06ee2866
ArthurZucker fix input embeds api
0b3e060e
ArthurZucker update rope API, smaller test and should be good to go
38053239
ArthurZucker oups wront place to skip unittest
a7849b57
ArthurZucker quality
4cc229cb
ArthurZucker update
47093554
ArthurZucker rope parameter docstring fill
3e3b0bb9
github-actions
ArthurZucker added the New model label
ArthurZucker merged cac0a28c into main 56 days ago
LysandreJik
Rocketknight1
