ggml-cpu: FA split across kv for faster TG #19209
am17an changed the title from "ggml-cpu: split across kv for faster TG" to "ggml-cpu: FA split across kv for faster TG" 123 days ago
am17an force pushed from 332f0766 to efe83e1e 121 days ago
am17an force pushed from efe83e1e to 88c5fa61 121 days ago
ggerganov approved these changes on 2026-02-02
ggml-cpu: split across kv for faster TG (353c85f2)
simplify sinks application (8c19a423)
add ref impl (3849170c)
am17an force pushed from 88c5fa61 to 3849170c 121 days ago
am17an merged 9f682fb6 into master 121 days ago
am17an deleted the opt-fa-decode branch 121 days ago