ggml-cpu: FA split across kv for faster TG #19209
am17an changed the title from "ggml-cpu: split across kv for faster TG" to "ggml-cpu: FA split across kv for faster TG" 27 days ago
am17an force pushed from 332f0766 to efe83e1e 25 days ago
am17an force pushed from efe83e1e to 88c5fa61 25 days ago
ggerganov approved these changes on 2026-02-02
ggml-cpu: split across kv for faster TG (353c85f2)
simplify sinks application (8c19a423)
add ref impl (3849170c)
am17an force pushed from 88c5fa61 to 3849170c 25 days ago
am17an merged commit 9f682fb6 into master 25 days ago
am17an deleted the opt-fa-decode branch 25 days ago