llama.cpp
e435bfd9 - RMSE-optimized quants for all quantization types

Commit
2 years ago
RMSE-optimized quants for all quantization types

By default this new option is ON. One can turn it off by setting LLAMA_NO_RMSE. With this option enabled, Q4_3 quantization gives a perplexity of 6.0344, which is 0.0273 lower than with simple Q4_3 quantization.
  • CMakeLists.txt
  • Makefile
  • ggml.c