llama.cpp
9725a313
- CUDA: reduce MMQ stream-k overhead (#22298)
Commit
22 days ago
CUDA: reduce MMQ stream-k overhead (#22298)

* CUDA: reduce MMQ stream-k overhead
* use 32 bit integers for kbc
References
#22298 - CUDA: reduce MMQ stream-k overhead
Author
JohannesGaessler
Parents
d1649047