llama.cpp
129d844c - Fix Q4_K and Q5_K for QK_K = 64 on CUDA (#2359)

Commit (1 year ago)

Fix Q4_K and Q5_K for QK_K = 64 on CUDA (#2359)

* Fix Q4_K and Q5_K for QK_K = 64
* Very slightly better Q5_K bit fiddling

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Files changed:
  • ggml-cuda.cu