llama.cpp
CUDA: fix FlashAttention on Turing #13415
Merged

JohannesGaessler added commit 6fe0f09c: CUDA: fix FlashAttention on Turing
github-actions added the Nvidia GPU and ggml labels
slaren approved these changes on 2025-05-09
JohannesGaessler merged d8919424 into master
