llama.cpp
955ef9a5 - ggml : alternative Q4_3 implementation using modified Q8_0 (#1109)

Commit
2 years ago
ggml : alternative Q4_3 implementation using modified Q8_0 (#1109)

* ggml : prefer vzip to vuzp
  This way we always use the same type of instruction across all quantizations
* ggml : alternative Q4_3 implementation using modified Q8_0
* ggml : fix Q4_3 scalar implementation
* ggml : slight improvement of Q4_3 - no need for loop unrolling
* ggml : fix AVX paths for Q8_0 quantization