Add afmoe model #42168

ArthurZucker merged 19 commits into huggingface:main from add_afmoe_model
alyosha-swamy Add AFMoE model support
1ae79d20
alyosha-swamy Merge remote-tracking branch 'upstream/main' into add_afmoe_model
3a4280cd
ArthurZucker commented on 2025-11-14
alyosha-swamy Address review feedback for AFMoE implementation
1314162f
alyosha-swamy Add flex attention support to AFMoE model
89586847
ArthurZucker Merge branch 'main' into add_afmoe_model
826cb122
alyosha-swamy Fix expert_bias routing in AFMoE
ecb74381
alyosha-swamy Remove test-results directory
045776de
ArthurZucker Merge branch 'main' into add_afmoe_model
46ca8d5a
ArthurZucker commented on 2025-11-28
ArthurZucker approved these changes on 2025-11-28
alyosha-swamy Address PR review feedback for AFMoE model
30c3a206
alyosha-swamy fix(afmoe): ensure RMSNorm output dtype matches input dtype)
79d9c9a5
ArthurZucker properly return attn weights
92ee54cb
HuggingFaceDocBuilderDev
ArthurZucker fix most tests
8a4049c6
ArthurZucker cleanup
06ee2866
ArthurZucker fix input embeds api
0b3e060e
ArthurZucker update rope API, smaller test and should be good to go
38053239
ArthurZucker oups wront place to skip unittest
a7849b57
ArthurZucker quality
4cc229cb
ArthurZucker update
47093554
ArthurZucker rope parameter docstring fill
3e3b0bb9
github-actions
ArthurZucker added the New model label
ArthurZucker merged cac0a28c into main 184 days ago
LysandreJik
Rocketknight1

