text-generation-inference
dbb23fbf - Use symmetric quantization in the `quantize` subcommand (#2120)

Packing of asymmetric quantization is broken: all (q)zeros values of `0` get reset to `1`, resulting in a loss of accuracy. Use symmetric quantization instead. To distinguish models with symmetric and asymmetric quantization, a new config tensor `gptq_sym` is added. If this tensor is not present, we assume `sym=False`.
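As a rough illustration of the new flag, the sketch below shows how a loader could check a GPTQ safetensors checkpoint for the `gptq_sym` tensor and fall back to `sym=False` when it is missing. The file argument, helper name, and the 4-bit dequantization comment are assumptions made for the example, not the repository's actual loading code.

```python
# Hypothetical illustration: detect whether a GPTQ checkpoint was produced
# with symmetric quantization by looking for the `gptq_sym` config tensor.
# The checkpoint path and helper name are assumptions, not TGI's real API.
from safetensors import safe_open


def gptq_is_symmetric(checkpoint_path: str) -> bool:
    """Return True if the checkpoint carries a truthy `gptq_sym` tensor.

    Older checkpoints do not have this tensor; per the commit message we
    then assume asymmetric quantization (sym=False).
    """
    with safe_open(checkpoint_path, framework="pt") as f:
        if "gptq_sym" not in f.keys():
            return False  # tensor absent: assume sym=False
        return bool(f.get_tensor("gptq_sym").item())


# With symmetric quantization the zero point sits at the middle of the
# integer range (e.g. 8 for 4-bit), so the packed qzeros never contain the
# value 0 that the asymmetric packing path mishandles:
#   w_approx = (q - 8) * scale   # 4-bit symmetric dequantization
```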