llama.cpp dd86df82 - metal : use mm kernel only for quantum KV cache
Commit
2 years ago
metal : use mm kernel only for quantum KV cache
References
#4312 - llama : support quantum K cache
Author
ggerganov
Parents
903167a7