llama.cpp
9f682fb6 - ggml-cpu: FA split across kv for faster TG (#19209)

Commit
18 days ago
ggml-cpu: FA split across kv for faster TG (#19209)

* ggml-cpu: split across kv for faster TG
* simplify sinks application
* add ref impl
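For readers unfamiliar with the idea in the title (FA is flash attention, TG is token generation), here is a minimal sketch of splitting attention across the KV dimension for a single query vector. It illustrates the general split-KV technique only and is not the code from this commit: the names `Partial`, `attend_chunk`, `merge_partials` and `split_kv_attention` are hypothetical, the chunks are processed sequentially where a real implementation would presumably hand them to worker threads, and attention sinks and masking are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Partial attention result for one contiguous chunk of the KV cache.
struct Partial {
    float m;               // maximum score seen in this chunk
    float s;               // sum of exp(score - m) over this chunk
    std::vector<float> o;  // unnormalized output: sum of exp(score - m) * V[j]
};

// Hypothetical helper: one query against KV rows [begin, end).
// K and V are row-major, each row has d elements.
Partial attend_chunk(const std::vector<float>& q, const std::vector<float>& K,
                     const std::vector<float>& V, std::size_t d,
                     std::size_t begin, std::size_t end) {
    Partial p{-std::numeric_limits<float>::infinity(), 0.0f,
              std::vector<float>(d, 0.0f)};
    const float scale = 1.0f / std::sqrt(static_cast<float>(d));
    for (std::size_t j = begin; j < end; ++j) {
        float score = 0.0f;
        for (std::size_t k = 0; k < d; ++k) score += q[k] * K[j * d + k];
        score *= scale;
        // Online softmax: rescale the running sums when the maximum grows.
        const float m_new   = std::max(p.m, score);
        const float rescale = std::exp(p.m - m_new);  // 0 for the first row
        const float w       = std::exp(score - m_new);
        p.s = p.s * rescale + w;
        for (std::size_t k = 0; k < d; ++k)
            p.o[k] = p.o[k] * rescale + w * V[j * d + k];
        p.m = m_new;
    }
    return p;
}

// Combine per-chunk partials into the final softmax-weighted output.
std::vector<float> merge_partials(const std::vector<Partial>& parts, std::size_t d) {
    float m = -std::numeric_limits<float>::infinity();
    for (const Partial& p : parts) m = std::max(m, p.m);
    float s = 0.0f;
    std::vector<float> out(d, 0.0f);
    for (const Partial& p : parts) {
        if (p.s == 0.0f) continue;          // empty chunk contributes nothing
        const float w = std::exp(p.m - m);  // bring chunk to the global maximum
        s += p.s * w;
        for (std::size_t k = 0; k < d; ++k) out[k] += p.o[k] * w;
    }
    assert(s > 0.0f && "at least one KV row is required");
    for (std::size_t k = 0; k < d; ++k) out[k] /= s;
    return out;
}

// Split the KV rows into n_chunks ranges, attend to each, then merge.
std::vector<float> split_kv_attention(const std::vector<float>& q,
                                      const std::vector<float>& K,
                                      const std::vector<float>& V,
                                      std::size_t d, std::size_t n_chunks) {
    assert(d > 0 && n_chunks > 0 && K.size() == V.size() && K.size() % d == 0);
    const std::size_t n_kv  = K.size() / d;
    const std::size_t chunk = (n_kv + n_chunks - 1) / n_chunks;
    std::vector<Partial> parts;
    for (std::size_t c = 0; c < n_chunks; ++c) {
        const std::size_t begin = std::min(c * chunk, n_kv);
        const std::size_t end   = std::min(begin + chunk, n_kv);
        parts.push_back(attend_chunk(q, K, V, d, begin, end));
    }
    return merge_partials(parts, d);
}
```

During token generation there is only one query row per head, so there is nothing to parallelize along the query dimension; splitting along KV gives each worker an independent slice, and the merge step only needs each slice's maximum, sum and unnormalized output. Calling `split_kv_attention` with `n_chunks = 1` gives an unsplit baseline to compare against.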