This can be useful, e.g., for low-bit quantization, where experience has shown that one can improve the model by changing the `f_rms_norm` parameter. Instead of having to specify the metadata override each time the model is used, with this PR one can encode the override during quantization using the `--override-kv` argument.

The `--override-kv` argument can be repeated multiple times.
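
To illustrate repeating the argument, here is a minimal Python sketch that assembles a quantization command line with one `--override-kv` flag per override. Everything specific in it is an assumption made for illustration, not something taken from this PR: the `./quantize` binary name, the file names, the `IQ2_XS` type, the two metadata keys and their values, and the `KEY=TYPE:VALUE` form of the override string. The helper `build_quantize_command` is hypothetical and only builds the argument list; it does not run anything.

```python
import shlex


def build_quantize_command(binary, overrides, model_in, model_out, quant_type):
    """Return an argv list with one --override-kv flag per metadata override.

    `overrides` maps a metadata key to a (type_name, value) pair.
    The KEY=TYPE:VALUE string format is an assumption of this sketch.
    """
    argv = [binary]
    for key, (type_name, value) in overrides.items():
        # The flag is repeated once for every key that should be overridden.
        argv += ["--override-kv", f"{key}={type_name}:{value}"]
    # Positional arguments: input model, output model, quantization type.
    argv += [model_in, model_out, quant_type]
    return argv


# Hypothetical overrides; keys and values are placeholders for illustration.
overrides = {
    "llama.attention.layer_norm_rms_epsilon": ("float", "1e-5"),
    "tokenizer.ggml.add_bos_token": ("bool", "false"),
}

argv = build_quantize_command(
    "./quantize", overrides, "model-f16.gguf", "model-iq2_xs.gguf", "IQ2_XS"
)

# Print the command as a single shell-quoted line for inspection.
print(shlex.join(argv))
```

Building the command as a list rather than a single string keeps each override a separate argument, so it can be passed to `subprocess.run` without extra shell quoting.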