text-generation-inference
dbb23fbf - Use symmetric quantization in the `quantize` subcommand (#2120)

Packing of asymmetric quantization is broken: all (q)zeros values of `0` get reset to `1`, resulting in a loss of accuracy. Use symmetric quantization instead. To distinguish models with symmetric and asymmetric quantization, a new config tensor `gptq_sym` is added. If this tensor is not present, we assume `sym=False`.
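As a rough illustration of the new flag, the sketch below shows how a loader could check a GPTQ safetensors checkpoint for the `gptq_sym` tensor and fall back to `sym=False` when it is missing. The file argument, helper name, and the 4-bit dequantization comment are assumptions made for the example, not the repository's actual loading code.

```python
# Hypothetical illustration: detect whether a GPTQ checkpoint was produced
# with symmetric quantization by looking for the `gptq_sym` config tensor.
# The checkpoint path and helper name are assumptions, not TGI's real API.
from safetensors import safe_open


def gptq_is_symmetric(checkpoint_path: str) -> bool:
    """Return True if the checkpoint carries a truthy `gptq_sym` tensor.

    Older checkpoints do not have this tensor; per the commit message we
    then assume asymmetric quantization (sym=False).
    """
    with safe_open(checkpoint_path, framework="pt") as f:
        if "gptq_sym" not in f.keys():
            return False  # tensor absent: assume sym=False
        return bool(f.get_tensor("gptq_sym").item())


# With symmetric quantization the zero point sits at the middle of the
# integer range (e.g. 8 for 4-bit), so the packed qzeros never contain the
# value 0 that the asymmetric packing path mishandles:
#   w_approx = (q - 8) * scale   # 4-bit symmetric dequantization
```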