transformers
be8d8a4c - Update AFMoE architecture to use v5-style MoE impl (#44063)

Commit
3 days ago
Update AFMoE architecture to use v5-style MoE impl (#44063)

* v5-style AFMoE impl
* don't unnecessarily return router logits
* inherit MoE code and refactor for stylistic consistency
* remove pointless type alias
* remove legacy cache reference
* type and lint

---------

Co-authored-by: Wing Lian <wing@axolotl.ai>