llama.cpp
1425f587 - CUDA: attention sinks for mma FlashAttention (#15157)
Commit
CUDA: attention sinks for mma FlashAttention (#15157)
Date: 30 days ago
Author: JohannesGaessler
Parents: aaa3d07a
References: #15157 - CUDA: attention sinks for mma FlashAttention
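For context: an attention sink is a per-head logit that participates in the softmax normalization but contributes no value, letting a head shrink its output toward zero by "attending" to the sink; the commit title's "mma" refers to the tensor-core (matrix multiply-accumulate) FlashAttention kernels. One common way to support sinks in a streaming (FlashAttention-style) softmax is to fold the sink logit into the running max/sum when the accumulators are finalized. The sketch below is a minimal host-side illustration of that finalization step under this assumption; it is not the commit's CUDA kernel, and all names in it are hypothetical.

```cpp
// Minimal host-side sketch: folding an attention sink into a
// streaming-softmax accumulator (hypothetical names, not the commit's kernel).
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// Running state kept per query row by an online-softmax pass:
//   m = running maximum of the logits seen so far
//   l = running sum of exp(logit - m)
//   o = unnormalized weighted sum of V entries (1-D here for brevity)
struct RowState {
    float m = -INFINITY;
    float l = 0.0f;
    float o = 0.0f;
};

// Standard online-softmax update for one (logit, value) pair.
void update(RowState &s, float logit, float value) {
    const float m_new = std::max(s.m, logit);
    const float scale = std::exp(s.m - m_new); // rescale old accumulators
    const float p     = std::exp(logit - m_new);
    s.l = s.l * scale + p;
    s.o = s.o * scale + p * value;
    s.m = m_new;
}

// Finalization with a per-head sink logit: the sink enters the softmax
// denominator but adds nothing to the numerator, so a dominant sink
// logit attenuates the row's output toward zero.
float finalize_with_sink(const RowState &s, float sink_logit) {
    const float m_new = std::max(s.m, sink_logit);
    const float l     = s.l * std::exp(s.m - m_new) + std::exp(sink_logit - m_new);
    const float o     = s.o * std::exp(s.m - m_new);
    return o / l;
}

int main() {
    RowState s;
    const std::vector<float> logits = {0.5f, 2.0f, -1.0f};
    const std::vector<float> values = {1.0f, 3.0f, -2.0f};
    for (size_t i = 0; i < logits.size(); ++i) {
        update(s, logits[i], values[i]);
    }
    // With sink_logit = -inf this reduces to plain softmax attention;
    // a large sink logit pulls the output toward zero.
    std::printf("no sink: %f\n", finalize_with_sink(s, -INFINITY));
    std::printf("sink=4 : %f\n", finalize_with_sink(s, 4.0f));
    return 0;
}
```

Handling the sink at finalization rather than inside the inner loop keeps the per-tile online-softmax updates unchanged, which is attractive for a tensor-core kernel; whether the actual commit takes exactly this route is an assumption here.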