llama.cpp
9f682fb6
- ggml-cpu: FA split across kv for faster TG (#19209)
Commit
18 days ago
ggml-cpu: FA split across kv for faster TG (#19209)

* ggml-cpu: split across kv for faster TG
* simplify sinks application
* add ref impl
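The commit message does not describe the implementation, so the following is only a generic sketch of the idea the title names: during token generation (TG) there is a single query row, so flash attention (FA) work can be divided along the KV dimension, with each chunk producing a partial result that is merged afterwards. Every name here (`Partial`, `attend_chunk`, `merge_partials`, `attend_split`) is hypothetical and not taken from ggml. The chunks run sequentially for simplicity, where a real implementation would presumably hand them to separate threads. Attention sinks and masking are omitted.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Vec = std::vector<float>;

// Partial attention state for one KV chunk (hypothetical type, not from ggml):
// running max of the logits, sum of exp(logit - max), and the unnormalized
// weighted sum of V rows.
struct Partial {
    float m = -std::numeric_limits<float>::infinity();
    float l = 0.0f;
    Vec   acc;
};

// Attend a single query to KV rows [begin, end) using online softmax.
Partial attend_chunk(const Vec & q, const std::vector<Vec> & K, const std::vector<Vec> & V,
                     std::size_t begin, std::size_t end, float scale) {
    Partial p;
    p.acc.assign(V.empty() ? 0 : V[0].size(), 0.0f);
    for (std::size_t i = begin; i < end; ++i) {
        float s = 0.0f;
        for (std::size_t d = 0; d < q.size(); ++d) s += q[d] * K[i][d];
        s *= scale;
        const float m_new = std::max(p.m, s);
        const float corr  = std::exp(p.m - m_new); // rescales old state; 0 on the first row
        const float w     = std::exp(s - m_new);
        for (std::size_t d = 0; d < p.acc.size(); ++d) p.acc[d] = p.acc[d] * corr + w * V[i][d];
        p.l = p.l * corr + w;
        p.m = m_new;
    }
    return p;
}

// Merge per-chunk partials: rescale each to the global max, sum, then normalize.
Vec merge_partials(const std::vector<Partial> & parts) {
    float m = -std::numeric_limits<float>::infinity();
    for (const Partial & p : parts) m = std::max(m, p.m);
    Vec out(parts.empty() ? 0 : parts[0].acc.size(), 0.0f);
    float l = 0.0f;
    for (const Partial & p : parts) {
        if (p.l == 0.0f) continue; // empty chunk contributes nothing
        const float c = std::exp(p.m - m);
        l += p.l * c;
        for (std::size_t d = 0; d < out.size(); ++d) out[d] += p.acc[d] * c;
    }
    if (l > 0.0f) {
        for (float & x : out) x /= l;
    }
    return out;
}

// Split the KV range into n_chunks (assumed >= 1) contiguous pieces and merge.
Vec attend_split(const Vec & q, const std::vector<Vec> & K, const std::vector<Vec> & V,
                 std::size_t n_chunks, float scale) {
    const std::size_t n     = K.size();
    const std::size_t chunk = (n + n_chunks - 1) / n_chunks;
    std::vector<Partial> parts;
    for (std::size_t c = 0; c < n_chunks; ++c) {
        const std::size_t begin = std::min(n, c * chunk);
        const std::size_t end   = std::min(n, begin + chunk);
        parts.push_back(attend_chunk(q, K, V, begin, end, scale));
    }
    return merge_partials(parts);
}
```

Because each partial carries its own max and sum, the merged result matches a single pass over the whole KV range up to floating-point rounding, which is what makes the split safe to parallelize.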
References
#19209 - ggml-cpu: FA split across kv for faster TG
Author
am17an
Parents
a3fa0358