llama.cpp
469e75d0 - llama : restore intended k-quants mixes for MoE models (#4872)

Commit

1 year ago

llama : restore intended k-quants mixes for MoE models (#4872)

* Restore intended k-quants quantization mixes for MoE models

* Update Q2_K_S values in the quantize tool

Still using LLaMA-v1 PPL values in the quant description today
does not make much sense. But let's leave this update for
another PR.

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

References

#4872 - Restore intended k-quants quantization mixes for MoE models

Author

ikawrakow

Parents

49662cbe

Files (3)

examples/quantize/quantize.cpp
llama.cpp
llama.h
