llama.cpp
129d844c
- Fix Q4_K and Q5_K for QK_K = 64 on CUDA (#2359)
Commit
1 year ago
Fix Q4_K and Q5_K for QK_K = 64 on CUDA (#2359)

* Fix Q4_K and Q5_K for QK_K = 64
* Very slightly better Q5_K bit fiddling

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
References
#2359 - Fix Q4_K and Q5_K for QK_K = 64 on CUDA
Author
ikawrakow
Parents
d5512b78
Files
1
ggml-cuda.cu