llama.cpp
16bcc125
- kv-cache : pad the cache size to 256 for performance (#17046)
Commit
69 days ago
kv-cache : pad the cache size to 256 for performance (#17046)

* kv-cache : pad the size of the small SWA cache for performance
* context : pad the total context to 256
* cont : future-proof the swa pad
* server : adjust test params to new logic
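Padding a size "to 256" amounts to rounding it up to the next multiple of 256. The sketch below illustrates only that arithmetic. The helper name `pad_to` and the constant `kv_pad` are hypothetical and are not taken from the llama.cpp source; the actual change in #17046 may be implemented differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant for illustration: the padding granularity
// named in the commit title.
static const uint32_t kv_pad = 256;

// Hypothetical helper (not from the llama.cpp source): round n up to the
// nearest multiple of `multiple`. Values that are already aligned are
// returned unchanged.
inline uint32_t pad_to(uint32_t n, uint32_t multiple) {
    assert(multiple > 0);
    const uint32_t rem = n % multiple;
    return rem == 0 ? n : n + (multiple - rem);
}
```

Under these assumptions, `pad_to(1000, kv_pad)` yields 1024, while an already aligned size such as 4096 is left as it is. Rounding up rather than down means the padded size is never smaller than the requested one.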
References
#17046 - kv-cache : pad the cache size to 256 for performance
Author
ggerganov
Parents
9eb9a133