llama.cpp
d7b800b8 - llama : pad KV cache size (#4280)
Commit
1 year ago
llama : pad KV cache size (#4280)
* llama : pad KV cache size to 32
* metal : try to improve batched decoding
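The commit title describes rounding the active KV cache size up to a multiple of 32 so that kernels (notably the Metal batched-decoding path) operate on nicely aligned ranges. Below is a minimal, hedged sketch of that idea; the helper name pad_to_multiple and the example values (cells_in_use, n_ctx) are illustrative assumptions, not the actual llama.cpp code.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: round n up to the nearest multiple of `pad`.
// Mirrors the idea of padding the active KV cache size to 32.
static uint32_t pad_to_multiple(uint32_t n, uint32_t pad) {
    return ((n + pad - 1) / pad) * pad;
}

int main() {
    // Assumed values for illustration only.
    uint32_t cells_in_use = 1000;  // KV cells currently occupied
    uint32_t n_ctx        = 4096;  // maximum context size

    // Pad the used-cell count to a multiple of 32, keep it at least 32,
    // and never exceed the configured context size.
    uint32_t n_kv = std::min(n_ctx,
                             std::max<uint32_t>(32, pad_to_multiple(cells_in_use, 32)));

    // With 1000 cells in use, the effective KV range becomes 1024.
    return n_kv == 1024 ? 0 : 1;
}
```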
References
#4280 - llama : pad KV cache size
Author
ggerganov
Parents
5a7d3125