llama.cpp
f4f9367f - less code duplication, offload k and v separately
Commit
2 years ago
less code duplication, offload k and v separately
References
#4309 - llama : per-layer KV cache
Author
slaren
Parents
55f2f2fb