transformers
Support `num_attention_heads` != `num_key_value_heads` in Flax Llama Implementation
#29557
Merged
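
When `num_attention_heads` differs from `num_key_value_heads` (grouped-query attention, as in Llama 2 70B), the key/value heads must be repeated so each group of query heads shares one KV head. A minimal JAX sketch of that repetition, with hypothetical shapes and a hypothetical `repeat_kv` helper name, not the literal code from this PR:

```python
# Sketch of grouped-query attention KV-head repetition (assumed shapes,
# illustrative only -- not the actual implementation merged in the PR).
import jax.numpy as jnp


def repeat_kv(hidden: jnp.ndarray, n_rep: int) -> jnp.ndarray:
    """Repeat key/value heads so their count matches the query heads.

    hidden: (batch, seq_len, num_kv_heads, head_dim)
    returns: (batch, seq_len, num_kv_heads * n_rep, head_dim)
    """
    if n_rep == 1:
        return hidden
    batch, slen, num_kv_heads, head_dim = hidden.shape
    # Insert a repeat axis and broadcast, then flatten it into the head axis.
    expanded = jnp.broadcast_to(
        hidden[:, :, :, None, :],
        (batch, slen, num_kv_heads, n_rep, head_dim),
    )
    return expanded.reshape(batch, slen, num_kv_heads * n_rep, head_dim)


# Example: 2 KV heads expanded to match 8 attention heads (n_rep = 4).
kv = jnp.ones((1, 5, 2, 64))
out = repeat_kv(kv, 4)
print(out.shape)  # (1, 5, 8, 64)
```

Broadcasting before the reshape keeps each KV head's repeats contiguous, so query head `i` attends to KV head `i // n_rep`, the standard GQA grouping.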