llama.cpp
62cef26a - model-conversion : add qat-q4 quantization targets (#15588)

Commit

157 days ago

model-conversion : add qat-q4 quantization targets (#15588)

This commit adds two targets to the Makefile for quantizing Quantization Aware Trained (QAT) models to the Q4_0 format. The motivation is that these targets set the data types of the token embedding and output tensors to Q8_0 instead of the default Q6_K. This is something that we wish to enforce for QAT Q4_0 models that are to be uploaded to ggml-org on Hugging Face, to guarantee the best quality.
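As a rough illustration of what such a target might end up running, the sketch below prints a llama-quantize command line that pins the token embedding and output tensors to Q8_0 while quantizing the rest of the model to Q4_0. The function name and the file names are hypothetical and are not taken from the commit. The actual Makefile targets may pass these settings differently.

```shell
#!/bin/sh
# Minimal sketch (not the commit's Makefile): build the llama-quantize
# command for a QAT model quantized to Q4_0.
# qat_q4_0_command and the .gguf file names are hypothetical.
qat_q4_0_command() {
    input_gguf="$1"
    output_gguf="$2"
    # Override the two tensors that would otherwise default to Q6_K,
    # then quantize everything else to Q4_0.
    printf '%s\n' "llama-quantize --token-embedding-type q8_0 --output-tensor-type q8_0 $input_gguf $output_gguf Q4_0"
}

# Dry run: only prints the command, does not execute it.
qat_q4_0_command model-f16.gguf model-Q4_0.gguf
```

Printing the command rather than executing it keeps the sketch runnable without a llama.cpp build. To run the quantization for real, pipe the output to `sh`, assuming `llama-quantize` is on the PATH.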

References

#15588 - model-conversion : add qat-q4 quantization targets

Author

danbev

Parents

8f5afa94
