Add AFMoE model support
1ae79d20
Merge remote-tracking branch 'upstream/main' into add_afmoe_model
3a4280cd
Address review feedback for AFMoE implementation
1314162f
Add flex attention support to AFMoE model
89586847
Merge branch 'main' into add_afmoe_model
826cb122
Fix expert_bias routing in AFMoE
ecb74381
Remove test-results directory
045776de
Merge branch 'main' into add_afmoe_model
46ca8d5a
Address PR review feedback for AFMoE model
30c3a206
fix(afmoe): ensure RMSNorm output dtype matches input dtype
79d9c9a5
properly return attn weights
92ee54cb
fix most tests
8a4049c6
cleanup
06ee2866
fix input embeds api
0b3e060e
update rope API, smaller test and should be good to go
38053239
oops, wrong place to skip unittest
a7849b57
quality
4cc229cb
update
47093554
rope parameter docstring fill
3e3b0bb9