transformers
98a80781 - Fix the misalignment between the l2norm in GDN of Qwen3-Next and the implementation in the FLA library. (#40842)

Commit
92 days ago
Fix the misalignment between the l2norm in GDN of Qwen3-Next and the implementation in the FLA library. (#40842)

* align torch implementation of gdn with fla.
* fix fla import.
* fix
* remove unused attr
* fixes
* strictly align l2norm in Qwen3-Next with FLA implementation.

---------

Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
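For readers wondering how two L2 normalizations can be "misaligned", the sketch below contrasts two common conventions in plain Python. It is an illustration only, not the code from this commit: the function names `l2norm_clamp` and `l2norm_rsqrt` are hypothetical, and the assumption that the two sides differed in exactly this way (clamping the norm versus adding eps under the square root) is not confirmed by the commit message.

```python
import math

def l2norm_clamp(x, eps=1e-12):
    # Hypothetical helper: x / max(||x||_2, eps), the clamp-style convention
    # documented for torch.nn.functional.normalize.
    norm = math.sqrt(sum(v * v for v in x))
    return [v / max(norm, eps) for v in x]

def l2norm_rsqrt(x, eps=1e-6):
    # Hypothetical helper: x * rsqrt(sum(x^2) + eps), assumed here to be the
    # kernel-style convention; eps sits inside the square root.
    inv_norm = 1.0 / math.sqrt(sum(v * v for v in x) + eps)
    return [v * inv_norm for v in x]

# For vectors of ordinary magnitude the two conventions nearly agree.
typical = [3.0, 4.0]
typical_gap = abs(l2norm_clamp(typical)[0] - l2norm_rsqrt(typical)[0])

# For small-norm vectors eps dominates the rsqrt form and the results diverge.
tiny = [1e-4, 0.0]
tiny_gap = abs(l2norm_clamp(tiny)[0] - l2norm_rsqrt(tiny)[0])
```

Under these assumptions the gap is negligible for typical inputs but large for near-zero vectors, which is why a torch fallback and a fused kernel need to share one definition to produce matching outputs.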