transformers
1391366f - Mamba-1/-2 init weights in mixer class (#43778)

Commit
1 day ago
Mamba-1/-2 init weights in mixer class (#43778)

* Move dt_bias init into Mamba2Mixer
* Remove redundant init from Mamba2PreTrainedModel
* Add back init for meta device
* Add dt_bias init into MambaMixer
* Formatting code
* [mamba,mamba2] add local init
* fix check
* style

---------

Co-authored-by: Arthur <arthur.zucker@gmail.com>
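For context on what a dt_bias init typically computes, here is a minimal sketch of the scheme commonly used for Mamba-style models: sample a step size dt log-uniformly in a range, then store the inverse softplus of it so that softplus(dt_bias) recovers dt. This is not the code from this commit. The function name `sample_dt_bias`, the default values for `dt_min`, `dt_max` and `dt_init_floor`, and the pure-Python form (no torch) are all assumptions made for illustration.

```python
import math
import random


def sample_dt_bias(num_heads, dt_min=0.001, dt_max=0.1, dt_init_floor=1e-4, seed=0):
    """Hypothetical helper: return bias values b with softplus(b) in [dt_min, dt_max]."""
    rng = random.Random(seed)
    biases = []
    for _ in range(num_heads):
        # Sample dt log-uniformly between dt_min and dt_max.
        log_dt = rng.random() * (math.log(dt_max) - math.log(dt_min)) + math.log(dt_min)
        # Clamp from below so the inverse softplus stays well defined.
        dt = max(math.exp(log_dt), dt_init_floor)
        # Inverse of softplus: b = dt + log(1 - exp(-dt)) = dt + log(-expm1(-dt)).
        biases.append(dt + math.log(-math.expm1(-dt)))
    return biases


def softplus(x):
    """softplus(x) = log(1 + exp(x))."""
    return math.log1p(math.exp(x))
```

Applying `softplus` to each returned value should give back a step size inside the sampled range, which is the property the bias is initialized to satisfy. Keeping this kind of routine next to the parameter it initializes, rather than in a shared model-level hook, is one plausible reading of the "local init" item above.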