llama.cpp
CUDA: use mma FA kernel for gqa > 4 on RTX 4000 #15035
Merged

JohannesGaessler added commit 069d410b: CUDA: use mma FA kernel for gqa > 4 on RTX 4000
github-actions added labels: Nvidia GPU, ggml
ggerganov approved these changes on 2025-08-02
JohannesGaessler merged 03d46982 into master
