llama.cpp d50f8897 - CUDA: stream-k decomposition for MMQ (#8018)
Commit
1 year ago
CUDA: stream-k decomposition for MMQ (#8018)

* CUDA: stream-k decomposition for MMQ
* fix undefined memory reads for small matrices
References
#8018 - CUDA: stream-k decomposition for MMQ
Author
JohannesGaessler
Parents
2075a66a