llama.cpp
Support requantizing models instead of only allowing quantization from 16/32-bit #1691
Merged
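
If this change follows the usual llama.cpp quantize workflow, requantizing an already-quantized model might look like the sketch below. The `--allow-requantize` flag is assumed to be what this PR introduces, and the file paths and quantization types are purely illustrative, not taken from this page:

```sh
# Hypothetical example: requantize an existing q8_0 model down to q4_0.
# Before this PR, quantize only accepted 16/32-bit (f16/f32) input models;
# --allow-requantize (assumed flag) permits already-quantized input.
./quantize --allow-requantize \
    ./models/7B/ggml-model-q8_0.bin \
    ./models/7B/ggml-model-q4_0.bin \
    q4_0
```

Note that requantizing from an already-quantized model compounds quantization error, so quality will generally be somewhat worse than quantizing the same target type directly from the 16/32-bit original; the feature trades accuracy for not needing to keep the full-precision weights around.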
