llama.cpp
85a7d867 - memory : remove KV cache size padding (#16812)

Commit · 8 days ago

memory : remove KV cache size padding (#16812)

* memory : remove KV cache size padding
* cont : restore padding for n_kv tensor shape
* server : use slot context size instead of training context size
* server : simplify context limit logic