llama.cpp
808aba39
- CUDA: optimize and refactor MMQ (#8416)
Commit
1 year ago
CUDA: optimize and refactor MMQ (#8416)

* CUDA: optimize and refactor MMQ
* explicit q8_1 memory layouts, add documentation
References
#8416 - CUDA: optimize and refactor MMQ
Author
JohannesGaessler
Parents
a977c115