llama.cpp
CUDA: add attention sinks for tile and wmma
#15178
Merged


am17an
am17an CUDA: add attention sinks for tile and wmma
4946c199
am17an requested a review from JohannesGaessler 33 days ago
github-actions added Nvidia GPU
github-actions added ggml
JohannesGaessler commented on 2025-08-09
am17an Review: formatting changes + remove syncthreads from tile + remove wa…
1ef7fd00
am17an
JohannesGaessler approved these changes on 2025-08-09
am17an merged 34c9d765 into master 32 days ago
am17an deleted the cuda_fattn_tile_wmma branch 32 days ago
IMbackK

