llama.cpp
ggml-cpu: FA split across kv for faster TG
#19209
Merged

am17an merged 3 commits into ggml-org:master from am17an:opt-fa-decode
am17an requested a review from ggerganov 28 days ago
github-actions added the ggml label
am17an changed the title from "ggml-cpu: split across kv for faster TG" to "ggml-cpu: FA split across kv for faster TG" 27 days ago
github-actions added the testing label
am17an force pushed from 332f0766 to efe83e1e 25 days ago
am17an force pushed from efe83e1e to 88c5fa61 25 days ago
ggerganov approved these changes on 2026-02-02
am17an ggml-cpu: split across kv for faster TG
353c85f2
am17an simplify sinks application
8c19a423
am17an add ref impl
3849170c
am17an force pushed from 88c5fa61 to 3849170c 25 days ago
am17an merged 9f682fb6 into master 25 days ago
am17an deleted the opt-fa-decode branch 25 days ago

