memory efficient fq: use it everywhere, delete the old version (#51159)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51159
This PR is the cleanup after #50561. At a high level, we make the new
definition of fake_quant the definition used by autograd, but keep the old
function around as a thin wrapper to keep the user-facing API the same.
In detail:
1. point `fake_quantize_per_tensor_affine`'s implementation at `fake_quantize_per_tensor_affine_cachemask`, discarding the mask (see the sketch after this list)
2. delete the `fake_quantize_per_tensor_affine` backward; autograd will automatically use the cachemask backward instead
3. delete all the `fake_quantize_per_tensor_affine` kernels, since they are no longer used by anything
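For illustration, a minimal Python sketch of the wrapper pattern from item 1, assuming the standard fake-quant math (`clamp(round(x / scale) + zero_point, quant_min, quant_max)`); the function names below are hypothetical stand-ins for the actual ATen kernels:
```
import torch

# Hypothetical stand-in for the cachemask kernel: returns the fake-quantized
# output plus a boolean mask of elements whose quantized value falls inside
# [quant_min, quant_max]. The mask is what the backward consumes.
def fake_quant_cachemask_sketch(x, scale, zero_point, quant_min, quant_max):
    q = torch.round(x / scale) + zero_point
    mask = (q >= quant_min) & (q <= quant_max)
    out = (torch.clamp(q, quant_min, quant_max) - zero_point) * scale
    return out, mask

# The thin wrapper from item 1: reuse the cachemask computation and drop the
# mask so the user-facing signature stays unchanged.
def fake_quant_wrapper_sketch(x, scale, zero_point, quant_min, quant_max):
    out, _ = fake_quant_cachemask_sketch(x, scale, zero_point, quant_min, quant_max)
    return out
```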
Test Plan:
```
python test/test_quantization.py TestFakeQuantize
```
Performance testing was done in the previous PR (#50561).
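As a quick sanity illustration of the unchanged gradient semantics (not part of the official test plan), a hedged spot check: with a straight-through estimator backward, gradients should flow only through elements whose quantized value lies inside `[quant_min, quant_max]`:
```
import torch

x = torch.tensor([-20.0, 0.5, 20.0], requires_grad=True)
# scale=0.1, zero_point=0, quant_min=-128, quant_max=127: the outer two
# elements quantize outside the range and should receive zero gradient.
y = torch.fake_quantize_per_tensor_affine(x, 0.1, 0, -128, 127)
y.sum().backward()
print(x.grad)  # expected: tensor([0., 1., 0.])
```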
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26090869
fbshipit-source-id: fda042881f77a993a9d15dafabea7cfaf9dc7c9c