DeepSpeed
9bf77782 - Fix a bug in the implementation of dequantization for inference (#3433)

Commit

1 year ago

Fix a bug in the implementation of dequantization for inference (#3433)

* bugfix in launch_dequantize()

  Get rid of `hid_cnt` and simply set #blocks to output size / #groups

* add a unit test for dequantization

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
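
For readers unfamiliar with the operation being fixed, here is a minimal pure-Python sketch of group-wise dequantization. It is not the code from `dequantize.cu`: the function name `dequantize_groupwise`, the symmetric scale-only scheme, and the flat list layout are all assumptions made for illustration. The per-group element count is computed as total size divided by the number of groups, which is one plausible reading of the "output size / #groups" ratio in the commit message.

```python
def dequantize_groupwise(q_values, scales, groups):
    """Hypothetical reference: multiply each quantized value by its group's scale.

    Assumes a symmetric scheme (no zero point) and contiguous, equal-sized
    groups. The real CUDA kernel may lay out data differently.
    """
    total = len(q_values)
    if groups <= 0 or total % groups != 0:
        raise ValueError("number of elements must be divisible by groups")
    if len(scales) != groups:
        raise ValueError("expected exactly one scale per group")

    # Elements per group: total size / number of groups.
    group_size = total // groups

    # Element i belongs to group i // group_size.
    return [float(q) * scales[i // group_size] for i, q in enumerate(q_values)]
```

A reference like this is mainly useful as an oracle in a unit test: run the GPU path and the reference on the same integers and scales, then compare the results within a tolerance.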

References

#3433 - Fix a bug in the implementation of dequantization for inference

Author

sakogan

Parents

0c83436a

Files (3)

csrc/transformer/inference/csrc
- dequantize.cu
- pt_binding.cpp
tests/unit/compression
- test_dequantization.py
