llama.cpp
CUDA: re-use MLA K data for V in MMA FA
#19057
Merged

JohannesGaessler added commit f5cfe168: CUDA: re-use MLA K data for V in MMA FA
github-actions added the Nvidia GPU label
github-actions added the ggml label
ggerganov approved these changes on 2026-01-24
JohannesGaessler merged 8f91ca54 into master 5 days ago
