llama.cpp
129d844c
- Fix Q4_K and Q5_K for QK_K = 64 on CUDA (#2359)
Commit
1 year ago
Fix Q4_K and Q5_K for QK_K = 64 on CUDA (#2359)

* Fix Q4_K and Q5_K for QK_K = 64
* Very slightly better Q5_K bit fiddling

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
References
#2359 - Fix Q4_K and Q5_K for QK_K = 64 on CUDA
Author
ikawrakow
Parents
d5512b78
Files
1
ggml-cuda.cu