dbr quantization: inline scale and zp (#68251)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68251
Before this PR, DBR quantization recalculated the scale and zero_point
in the converted model every time they were needed, which was slow.
This PR adds a pass to the convert function that goes through every
observer in the model and caches its scale and zero_point.
Note: restricting this caching to only the observers that correspond
to int8 operations is left for a future PR.
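Conceptually, the pass resembles the minimal sketch below. This is an
illustration, not the actual DBR code: the name `cache_qparams` is
hypothetical, and it assumes each observer exposes the standard
`calculate_qparams()` method that PyTorch observers provide.
```
import torch
from torch.ao.quantization.observer import ObserverBase

def cache_qparams(model: torch.nn.Module) -> dict:
    """Hypothetical sketch: compute each observer's qparams once at
    convert time and cache them, instead of recalculating them on
    every call in the converted model."""
    cached = {}
    for name, module in model.named_modules():
        if isinstance(module, ObserverBase):
            # calculate_qparams() is the expensive call being avoided;
            # run it once here and store the result for reuse.
            scale, zero_point = module.calculate_qparams()
            cached[name] = (scale, zero_point)
    return cached
```
At inference time, the cached (scale, zero_point) pairs would then be
looked up by observer name rather than recomputed.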
Test Plan:
```
python test/test_quantization.py TestQuantizeDBR
```
Reviewed By: VitalyFedyunin
Differential Revision: D32463769
Pulled By: vkuzo
fbshipit-source-id: d1d2e598e2bccc1958e5023096b451d69dc34e29