llama.cpp
904d2a8d - Q4_1 quantization (#193)

Commit
2 years ago
Q4_1 quantization (#193)

* Add AVX2 version of ggml_vec_dot_q4_1
* Small optimisations to q4_1 dot product (@Const-me)
* Rearrange Q4_1 quantization to work for multipart models. (Fix #152)
* Fix ggml_vec_mad_q4_1 too
* Fix non-vectorised q4_1 vec mul