llama.cpp 0c21677e - CUDA: faster FA for GQA > 1 but not power of 2 (#19092)
Commit
27 days ago
CUDA: faster FA for GQA > 1 but not power of 2 (#19092)
References
#19092 - CUDA: faster FA for GQA > 1 but not power of 2
Author
JohannesGaessler
Parents
0440bfd1