llama.cpp
30eef31b
- Make quantize_row_iq4_nl do the same thing as quantization on CUDA
Commit
1 year ago
Make quantize_row_iq4_nl do the same thing as quantization on CUDA

This time for real. backend-ops tests pass.
References
#6196 - Make IQ4_NL quantization be the same on CPU/CUDA/Metal when quantizing K-cache
Author
Iwan Kawrakow
Parents
cd4a7c4c