transformers
10e97cd5 - Fix Mamba2ForCausalLM weight tying (#43207)

Commit
31 days ago
Fix Mamba2ForCausalLM weight tying (#43207)

* Fix Mamba2ForCausalLM weight tying

  Add _tied_weights_keys mapping to enable proper weight tying when tie_word_embeddings=True. This is the standard pattern used by MambaForCausalLM, GPT2, LLaMA, and other models.

  Fixes #43206

* Enable weight tying in Mamba2ModelTester for regression testing

* Add explicit regression test for Mamba2 weight tying

  Replace ModelTester default with explicit test per reviewer feedback.
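For readers unfamiliar with weight tying, the idea behind the fix can be sketched in plain Python. This is a toy illustration only, not the transformers implementation: the class ToyCausalLM, its tied_weights_keys attribute, the parameter names, and the tie_weights method are all hypothetical stand-ins. It assumes tying means that the output head and the input embeddings refer to the same underlying weights, driven by a name-to-name mapping.

```python
# Toy illustration of weight tying; NOT the transformers implementation.
# ToyCausalLM, tied_weights_keys, and the parameter names are hypothetical.


class ToyCausalLM:
    # Hypothetical mapping: tied parameter name -> source parameter name.
    tied_weights_keys = {"lm_head.weight": "embeddings.weight"}

    def __init__(self, vocab_size, hidden_size, tie_word_embeddings=True):
        # Two separately initialised weight matrices (nested lists stand in
        # for tensors so the sketch needs no third-party libraries).
        self.params = {
            "embeddings.weight": [[0.0] * hidden_size for _ in range(vocab_size)],
            "lm_head.weight": [[1.0] * hidden_size for _ in range(vocab_size)],
        }
        if tie_word_embeddings:
            self.tie_weights()

    def tie_weights(self):
        # Make each tied name refer to the same object as its source, so an
        # in-place update to one is visible through the other.
        for target, source in self.tied_weights_keys.items():
            self.params[target] = self.params[source]


tied = ToyCausalLM(vocab_size=4, hidden_size=2, tie_word_embeddings=True)
untied = ToyCausalLM(vocab_size=4, hidden_size=2, tie_word_embeddings=False)

# Simulate a training update to the embeddings of both models.
tied.params["embeddings.weight"][0][0] = 0.5
untied.params["embeddings.weight"][0][0] = 0.5

# Only the tied model's head shares storage with its embeddings.
print(tied.params["lm_head.weight"] is tied.params["embeddings.weight"])
print(untied.params["lm_head.weight"] is untied.params["embeddings.weight"])
```

Without a mapping like the hypothetical tied_weights_keys above, the tie_weights loop has nothing to iterate over and the head keeps its own separate weights even when tie_word_embeddings is set. That is the kind of silent mismatch a regression test on parameter identity is meant to catch.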