llama.cpp
16bcc125 - kv-cache : pad the cache size to 256 for performance (#17046)

Commit

141 days ago

kv-cache : pad the cache size to 256 for performance (#17046)

* kv-cache : pad the size of the small SWA cache for performance
* context : pad the total context to 256
* cont : future-proof the swa pad
* server : adjust test params to new logic
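
For readers unfamiliar with size padding, the arithmetic can be sketched as follows. This is a minimal illustration, assuming that "pad to 256" means rounding a requested size up to the next multiple of 256. The helper name `pad_to_multiple` is hypothetical and is not taken from the llama.cpp sources; the commit's actual diff is not shown here.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only (not the llama.cpp API):
// round n up to the nearest multiple of pad.
// Assumes pad > 0 and that n + pad - 1 does not overflow uint32_t.
constexpr uint32_t pad_to_multiple(uint32_t n, uint32_t pad) {
    return ((n + pad - 1) / pad) * pad;
}

// Compile-time checks of the rounding behaviour.
static_assert(pad_to_multiple(1000, 256) == 1024, "1000 rounds up to 1024");
static_assert(pad_to_multiple(1024, 256) == 1024, "exact multiples are unchanged");
static_assert(pad_to_multiple(1,    256) == 256,  "small sizes round up to one full block");
```

Under this assumption, a requested context of 1000 cells would be allocated as 1024, while a request that is already a multiple of 256 is left as is. This may also explain why the commit adjusts server test parameters: any expectation tied to an exact, unpadded size would shift.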

References

#17046 - kv-cache : pad the cache size to 256 for performance

Author

ggerganov

Parents

9eb9a133
