llama.cpp
85a7d867 - memory : remove KV cache size padding (#16812)
8 days ago
memory : remove KV cache size padding (#16812)

* memory : remove KV cache size padding
* cont : restore padding for n_kv tensor shape
* server : use slot context size instead of training context size
* server : simplify context limit logic
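As a rough illustration of what this commit message describes (a minimal sketch, not the actual llama.cpp code): the "padding" in question is GGML_PAD-style rounding of a size up to an alignment multiple. The commit stops padding the KV cache's allocated size while keeping padding on the n_kv tensor shape seen by the attention kernels. All names and the 256-cell alignment below are hypothetical.

```cpp
#include <cstdint>
#include <cstdio>

// Round x up to the nearest multiple of n (n a power of two),
// mirroring ggml's GGML_PAD macro: ((x) + (n) - 1) & ~((n) - 1).
static uint32_t pad_to(uint32_t x, uint32_t n) {
    return (x + n - 1) & ~(n - 1);
}

int main() {
    const uint32_t n_ctx   = 1000; // requested context size (hypothetical)
    const uint32_t padding = 256;  // hypothetical alignment unit

    // Before: the KV cache was allocated with a padded cell count.
    uint32_t kv_size_before = pad_to(n_ctx, padding); // 1024 cells

    // After: the cache is sized exactly as requested; only the n_kv
    // tensor shape passed to the attention kernels keeps its padding.
    uint32_t kv_size_after = n_ctx;                   // 1000 cells
    uint32_t n_kv_used     = 512;                     // cells in use (example)
    uint32_t n_kv_shape    = pad_to(n_kv_used, padding);

    printf("before: %u cells, after: %u cells, n_kv shape: %u\n",
           kv_size_before, kv_size_after, n_kv_shape);
    return 0;
}
```

The server-side bullets follow the same theme: limit checks compare against the slot's own context size rather than the model's training context size, which removes a special case from the context limit logic.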
References
#16812 - memory : remove KV cache size padding
Author
ggerganov
Parents
a8ca18b4