Update AFMoE architecture to use v5-style MoE impl (#44063)
* v5-style AFMoE impl
* don't unnecessarily return router logits
* inherit MoE code and refactor for stylistic consistency
* remove pointless type alias
* remove legacy cache reference
* typing and lint fixes
---------
Co-authored-by: Wing Lian <wing@axolotl.ai>