RMSE-optimized quants for all quantization types
By default this new option is ON. One can turn it off
by setting LLAMA_NO_RMSE.
With this option enabled, Q4_3 quantization results
in a perplexity of 6.0344, which is 0.0273 lower than
with simple (non-RMSE-optimized) Q4_3 quantization.
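
For readers unfamiliar with the idea, the following is a minimal
sketch of what "RMSE-optimized" can mean for a single block. It is
not the code in this commit: it assumes a symmetric 4-bit format
with one scale per block and levels in [-8, 7] (the real Q4_3
layout differs), and the function names, the candidate grid of
+/-20% in 2% steps, and the fallback scale of 1.0 are all
hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Round to nearest integer without libm (illustrative helper). */
static int round_nearest(float v) {
    return (int)(v >= 0.0f ? v + 0.5f : v - 0.5f);
}

/* Sum of squared errors when each x[i] is quantized with scale d
 * to an integer level clamped to [-8, 7] and then dequantized. */
static float block_sq_error(const float *x, size_t n, float d) {
    float err = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        int q = round_nearest(x[i] / d);
        if (q < -8) q = -8;
        if (q > 7) q = 7;
        float diff = x[i] - d * (float)q;
        err += diff * diff;
    }
    return err;
}

/* "Simple" scale: map the largest magnitude in the block to level 7.
 * Falls back to 1.0 for an all-zero block (assumed convention). */
static float simple_scale(const float *x, size_t n) {
    float amax = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float a = x[i] < 0.0f ? -x[i] : x[i];
        if (a > amax) amax = a;
    }
    return amax > 0.0f ? amax / 7.0f : 1.0f;
}

/* RMSE-optimized scale: try candidate scales around the simple one
 * and keep whichever gives the lowest squared error for the block. */
static float rmse_scale(const float *x, size_t n) {
    assert(n > 0);
    float d0 = simple_scale(x, n);
    float best_d = d0;
    float best_err = block_sq_error(x, n, d0);
    for (int step = -10; step <= 10; ++step) {
        float d = d0 * (1.0f + 0.02f * (float)step);
        float err = block_sq_error(x, n, d);
        if (err < best_err) {
            best_err = err;
            best_d = d;
        }
    }
    return best_d;
}
```

Because the search starts from the simple scale and only accepts
strictly better candidates, the error of the chosen scale is never
worse than that of the simple scale for the same block. The cost is
extra work at quantization time, which is presumably why the option
can be disabled.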