llama.cpp
b2acedeb
Commit
2 years ago
cuda : add F32 -> Q4_0 and F32 -> Q4_1 copy kernels
References
#4312 - llama : support quantum K cache
Author
ggerganov
Parents
e8457c90
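
The commit title above indicates that F32 -> Q4_0 and F32 -> Q4_1 copy kernels were added to the CUDA backend, which allows F32 data to be copied directly into quantized tensors (the referenced #4312 concerns a quantized K cache). The commit diff itself is not shown on this page, so the sketch below only illustrates what such a copy kernel does for Q4_0: each group of 32 floats is reduced to one per-block scale plus 32 packed 4-bit values. The block layout follows llama.cpp's Q4_0 format, but the kernel name, the launch scheme, and the assumption of a contiguous source buffer are illustrative, not taken from the commit.

```cuda
// Minimal sketch of an F32 -> Q4_0 copy (quantize) kernel, assuming a
// contiguous source buffer and one CUDA thread per 32-element block.
// The actual ggml-cuda kernel also handles arbitrary tensor strides;
// the names below (cpy_f32_q4_0_sketch, QK4_0_SKETCH) are hypothetical.
#include <cuda_fp16.h>
#include <cstdint>

#define QK4_0_SKETCH 32

struct block_q4_0_sketch {
    half    d;                      // per-block scale
    uint8_t qs[QK4_0_SKETCH / 2];   // 32 values packed as 4-bit nibbles
};

__global__ void cpy_f32_q4_0_sketch(const float * x, block_q4_0_sketch * y, int nblocks) {
    const int ib = blockIdx.x * blockDim.x + threadIdx.x;
    if (ib >= nblocks) {
        return;
    }

    const float * xb = x + ib * QK4_0_SKETCH;

    // find the value with the largest magnitude; its sign orients the
    // scale so that this value maps onto the -8 quantization level
    float amax = 0.0f;
    float vmax = 0.0f;
    for (int j = 0; j < QK4_0_SKETCH; ++j) {
        const float v = xb[j];
        if (fabsf(v) > amax) {
            amax = fabsf(v);
            vmax = v;
        }
    }

    const float d  = vmax / -8.0f;
    const float id = d != 0.0f ? 1.0f / d : 0.0f;

    y[ib].d = __float2half(d);

    // quantize to 4 bits with a +8 offset and pack two values per byte:
    // element j goes in the low nibble, element j+16 in the high nibble
    for (int j = 0; j < QK4_0_SKETCH / 2; ++j) {
        const float x0 = xb[j] * id;
        const float x1 = xb[j + QK4_0_SKETCH / 2] * id;

        const uint8_t xi0 = min(15, (int)(x0 + 8.5f));
        const uint8_t xi1 = min(15, (int)(x1 + 8.5f));

        y[ib].qs[j] = xi0 | (xi1 << 4);
    }
}
```

A host would launch this with one thread per block of 32 source values, e.g. `cpy_f32_q4_0_sketch<<<(nblocks + 255)/256, 256>>>(src, dst, nblocks)`; the Q4_1 variant differs only in storing a separate minimum alongside the scale instead of centering the range around zero.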