llama.cpp
CUDA: enable FA for FP32 KV cache
#16546
Merged

JohannesGaessler committed: CUDA: enable FA for FP32 KV cache (31f2d456)
github-actions added the Nvidia GPU label
github-actions added the ggml label
ggerganov approved these changes on 2025-10-14
JohannesGaessler merged 9c7185dd into master 21 days ago
CISC
JohannesGaessler
