Add z-loss to Bamba for v2 (#37842)
* Remove const
* Fix arg ref
* Sharded save
* Add z_loss flag
* Add modeling zloss
* Demodularize clm forward for zloss
* Also demodularize init for z_loss flag
* PR comments (mostly modularizing right)
* Demodularize forward
* Better name for zloss and explain type matching
* Fully propagate coeff name
* Style fixes
* Make zloss default a float
* Remove conflicting annotations
---------
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>