llama.cpp
61a88a1d
- llama : fix BERT inference without KV cache
Date
1 year ago
References
#7531 - llama : support Jamba hybrid Transformer-Mamba models
Author
compilade
Parents
0fd13e94
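
Context for the fix (illustrative, not taken from the commit's diff): BERT is an encoder-only model, so inference is a single bidirectional attention pass over the whole input. There is no autoregressive decode loop, and hence no past keys/values to carry between steps; code paths that assume a KV cache is always present can therefore break encoder-only inference. The sketch below, a minimal standalone C++ example, shows the cache-free attention pattern; encoder_attention is a hypothetical name, not a llama.cpp identifier.

// Minimal sketch of non-causal (encoder-style) single-head attention.
// Every token attends to every other token in one pass, so there is
// nothing to cache between calls -- unlike a decoder, which appends each
// new token's key/value to a cache to avoid recomputing past projections.
#include <cmath>
#include <cstdio>
#include <vector>

// q, k, v: [n_tokens x d], row-major. Returns [n_tokens x d].
std::vector<float> encoder_attention(const std::vector<float>& q,
                                     const std::vector<float>& k,
                                     const std::vector<float>& v,
                                     int n_tokens, int d) {
    std::vector<float> out(n_tokens * d, 0.0f);
    const float scale = 1.0f / std::sqrt((float) d);
    for (int i = 0; i < n_tokens; ++i) {
        // Scores of token i against all tokens; no causal mask.
        std::vector<float> s(n_tokens);
        float max_s = -1e30f;
        for (int j = 0; j < n_tokens; ++j) {
            float dot = 0.0f;
            for (int c = 0; c < d; ++c) {
                dot += q[i*d + c] * k[j*d + c];
            }
            s[j] = dot * scale;
            if (s[j] > max_s) max_s = s[j];
        }
        // Numerically stable softmax over the scores.
        float sum = 0.0f;
        for (int j = 0; j < n_tokens; ++j) {
            s[j] = std::exp(s[j] - max_s);
            sum += s[j];
        }
        // Weighted sum of values.
        for (int j = 0; j < n_tokens; ++j) {
            const float w = s[j] / sum;
            for (int c = 0; c < d; ++c) {
                out[i*d + c] += w * v[j*d + c];
            }
        }
    }
    return out;
}

int main() {
    const int n_tokens = 3, d = 4;
    std::vector<float> q(n_tokens * d, 0.1f);
    std::vector<float> k(n_tokens * d, 0.2f);
    std::vector<float> v(n_tokens * d, 0.3f);
    std::vector<float> out = encoder_attention(q, k, v, n_tokens, d);
    std::printf("out[0][0] = %f\n", out[0]);
    return 0;
}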