transformers
a6f0e2b6 - Add z-loss to Bamba for v2 (#37842)

Commit
230 days ago
Add z-loss to Bamba for v2 (#37842)

* Remove const
* Fix arg ref
* Sharded save
* Add z_loss flag
* Add modeling zloss
* Demodularize clm forward for zloss
* Also demodularize init for z_loss flag
* PR comments (mostly modularizing right)
* Demodularize forward
* Better name zloss and explain typematch
* Fully propagate coeff name
* style fixes
* zloss default float
* Remove conflicting annotations

---------

Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
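
For readers unfamiliar with the term, z-loss is commonly defined as an auxiliary penalty on the squared log-partition function of the output logits, scaled by a small coefficient and added to the usual cross-entropy loss. The sketch below illustrates that general definition in plain Python. It is not the code from this commit: the function names, the `z_loss_coeff` parameter name, and the `1e-4` default are hypothetical and chosen only for illustration, and the Bamba implementation may differ in reduction, masking, or dtype handling.

```python
import math


def logsumexp(row):
    """Numerically stable log(sum(exp(x))) for one row of logits."""
    m = max(row)
    return m + math.log(sum(math.exp(x - m) for x in row))


def cross_entropy(logits, targets):
    """Mean cross-entropy over rows; targets are class indices."""
    losses = [logsumexp(row) - row[t] for row, t in zip(logits, targets)]
    return sum(losses) / len(losses)


def z_loss(logits, z_loss_coeff=1e-4):
    """Mean of z_loss_coeff * log(Z)^2 over rows (hypothetical names and default)."""
    penalties = [logsumexp(row) ** 2 for row in logits]
    return z_loss_coeff * sum(penalties) / len(penalties)


def total_loss(logits, targets, z_loss_coeff=1e-4):
    """Cross-entropy plus the auxiliary z-loss term."""
    return cross_entropy(logits, targets) + z_loss(logits, z_loss_coeff)
```

The penalty grows as the log-partition function drifts away from zero, which is why it is typically used with a small coefficient as a stabilizer rather than as a primary objective.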