transformers
4c040aba - [mistral] Support passing `head_dim` through config (and do not require `head_dim * num_heads == hidden_size`) (#32050)

Commit
1 year ago
[mistral] Support passing `head_dim` through config (and do not require `head_dim * num_heads == hidden_size`) (#32050)

* Allow `head_dim` to be set in Mistral config
* Add docstring
* Do not require `head_dim * num_heads == hidden_size`
* [run-slow] mistral
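
To illustrate what dropping the `head_dim * num_heads == hidden_size` requirement means for the attention projections, here is a minimal sketch in plain Python. `SketchConfig` and `projection_shapes` are hypothetical names invented for this example and are not part of the library. The fallback to `hidden_size // num_attention_heads` when `head_dim` is unset, and the default sizes, are assumptions about the intended behavior rather than a copy of the actual change.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchConfig:
    """Hypothetical stand-in for a Mistral-style config (not the real class)."""

    hidden_size: int = 4096
    num_attention_heads: int = 32
    num_key_value_heads: int = 8
    head_dim: Optional[int] = None

    def __post_init__(self):
        # Assumed fallback: derive head_dim from hidden_size when not given.
        if self.head_dim is None:
            self.head_dim = self.hidden_size // self.num_attention_heads


def projection_shapes(cfg):
    """Return (in_features, out_features) for each attention projection."""
    q_out = cfg.num_attention_heads * cfg.head_dim
    kv_out = cfg.num_key_value_heads * cfg.head_dim
    return {
        "q_proj": (cfg.hidden_size, q_out),
        "k_proj": (cfg.hidden_size, kv_out),
        "v_proj": (cfg.hidden_size, kv_out),
        # o_proj maps the concatenated heads back to hidden_size, so the
        # two widths are allowed to differ.
        "o_proj": (q_out, cfg.hidden_size),
    }


if __name__ == "__main__":
    # head_dim * num_heads (128 * 32 = 4096) differs from hidden_size (5120).
    cfg = SketchConfig(hidden_size=5120, num_attention_heads=32, head_dim=128)
    for name, shape in projection_shapes(cfg).items():
        print(name, shape)
```

In this sketch the only place `hidden_size` and `num_attention_heads * head_dim` meet is at the input of `q_proj`/`k_proj`/`v_proj` and the output of `o_proj`, which is why the two quantities no longer need to be equal.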