llama.cpp
cfd3be76 - ggml : same IQ4_NL quantization for CPU/CUDA/Metal (#6196)

Commit
1 year ago
ggml : same IQ4_NL quantization for CPU/CUDA/Metal (#6196)

* Make quantize_row_iq4_nl do the same thing as quantization on CUDA
* Make quantize_row_iq4_nl do the same thing as quantization on CUDA

  This time for real. backend-ops tests pass.
* Now fix test-quantize-fns

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>