memory efficient fq: use it everywhere, delete the old version (#51159)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51159
This PR is the cleanup after #50561. At a high level, we make the new
definition of fake_quant the definition used by autograd, but keep the old
function around as a thin wrapper to keep the user-facing API the same.
In detail:
1. point `fake_quantize_per_tensor_affine`'s implementation at `fake_quantize_per_tensor_affine_cachemask`, discarding the mask (see the sketch after this list)
2. delete the `fake_quantize_per_tensor_affine` backward; autograd will automatically use the cachemask backward instead
3. delete all the `fake_quantize_per_tensor_affine` kernels, since they are no longer used by anything
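For illustration, a minimal Python sketch of the wrapper pattern from item 1, assuming the standard fake-quant math (`clamp(round(x / scale) + zero_point, quant_min, quant_max)`); the function names below are hypothetical stand-ins for the actual ATen kernels:
```
import torch

# Hypothetical stand-in for the cachemask kernel: returns the fake-quantized
# output plus a boolean mask of elements whose quantized value falls inside
# [quant_min, quant_max]. The mask is what the backward consumes.
def fake_quant_cachemask_sketch(x, scale, zero_point, quant_min, quant_max):
    q = torch.round(x / scale) + zero_point
    mask = (q >= quant_min) & (q <= quant_max)
    out = (torch.clamp(q, quant_min, quant_max) - zero_point) * scale
    return out, mask

# The thin wrapper from item 1: reuse the cachemask computation and drop the
# mask so the user-facing signature stays unchanged.
def fake_quant_wrapper_sketch(x, scale, zero_point, quant_min, quant_max):
    out, _ = fake_quant_cachemask_sketch(x, scale, zero_point, quant_min, quant_max)
    return out
```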
Test Plan:
```
python test/test_quantization.py TestFakeQuantize
```
Performance testing was done in the previous PR (#50561).
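As a quick sanity illustration of the unchanged gradient semantics (not part of the official test plan), a hedged spot check: with a straight-through estimator backward, gradients should flow only through elements whose quantized value lies inside `[quant_min, quant_max]`:
```
import torch

x = torch.tensor([-20.0, 0.5, 20.0], requires_grad=True)
# scale=0.1, zero_point=0, quant_min=-128, quant_max=127: the outer two
# elements quantize outside the range and should receive zero gradient.
y = torch.fake_quantize_per_tensor_affine(x, 0.1, 0, -128, 127)
y.sum().backward()
print(x.grad)  # expected: tensor([0., 1., 0.])
```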
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26090869
fbshipit-source-id: fda042881f77a993a9d15dafabea7cfaf9dc7c9c