llama.cpp
11d4e099 - iq3_s: PPL improvement

Commit
1 year ago
iq3_s: PPL improvement

E.g., for a context of 4096, LLaMA-v2-7B PPL goes to 5.1340 from 5.1653.
  • File: ggml-cuda.cu
  • File: ggml-quants.c