llama.cpp
dd0eabc0 - ggml : use full range for Q4_0 and Q4_2 quantization (#729)

Commit
3 years ago
ggml : use full range for Q4_0 and Q4_2 quantization (#729)

* Use full range for q4_0 quantization

  By keeping the sign of the highest magnitude, we can make sure the highest value maps to -8, which is currently unused. This is a bit of a freebie since it is fully backwards compatible with the current format.

* Update quantize_row_q4_0 for AVX/AVX2

* Update quantize_row_q4_0 for WASM

  Untested

* Update quantize_row_q4_0 for Arm NEON

* Update quantize_row_q4_0 for PowerPC

  Untested

* Use full range for q4_2 quantization
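
The idea in the commit message can be illustrated with a minimal scalar sketch. This is not the code from the commit: the helper names `quantize_block_full_range` and `dequantize_value`, the block size `QK` of 32, and the choice to store signed codes in an `int8_t` array (rather than packing two 4-bit values per byte) are all assumptions made for readability. It shows only how dividing the signed highest-magnitude value by -8 puts that value on the otherwise unused code -8.

```c
#include <assert.h>
#include <stdint.h>

#define QK 32  /* values per block; assumed here, not taken from the commit */

/* Hypothetical helper, not ggml's API: quantize one block of QK floats to
 * signed 4-bit codes in [-8, 7] and return the block scale d. */
static float quantize_block_full_range(const float *x, int8_t *q) {
    float amax = 0.0f;  /* largest magnitude seen so far */
    float max  = 0.0f;  /* the signed value that has that magnitude */
    for (int i = 0; i < QK; i++) {
        const float a = x[i] < 0.0f ? -x[i] : x[i];
        if (a > amax) {
            amax = a;
            max  = x[i];
        }
    }

    /* Dividing by -8 keeps the sign of max, so max itself maps to code -8. */
    const float d  = max / -8.0f;
    const float id = (d != 0.0f) ? 1.0f / d : 0.0f;

    for (int i = 0; i < QK; i++) {
        /* x[i]*id lies in [-8, 8]; adding 8.5 and truncating rounds it
         * to the nearest integer in [0, 16], then we shift back by 8. */
        int v = (int)(x[i] * id + 8.5f) - 8;
        if (v > 7) v = 7;  /* a value equal to -max would land on +8; clamp */
        assert(v >= -8);
        q[i] = (int8_t)v;
    }
    return d;
}

/* Hypothetical helper: reconstruct one value from its code and the scale. */
static float dequantize_value(float d, int8_t q) {
    return d * (float)q;
}
```

With a block whose highest-magnitude value is 4.0, the scale is -0.5 and 4.0 is stored as -8, so it is reconstructed exactly. A value of -4.0 in the same block would need code +8, which does not exist in 4 bits, so it is clamped to 7 and reconstructed as -3.5. Because the dequantization step is unchanged (scale times code), data produced this way should remain readable by existing decoders, which matches the backwards-compatibility point in the message.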