dbr quant: split observer insertion to a separate pass (#71253)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71253
Before this PR, observers were inserted at the same time as
we recorded the ops seen while tracing with example input. This is not
ideal because function fusion (not yet implemented) requires
looking ahead from the current op in order to insert observers
correctly.
This PR refactors observer insertion in DBR quantization to happen
in a separate pass after the ops are recorded. There is no functional
change in this diff, but it will make function fusion easier to
implement in a future PR.
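To illustrate why a separate pass helps, here is a minimal sketch of
the two-pass idea. It is not the DBR implementation: `SeenOp`,
`record_ops`, `insert_observers`, and the `fusible_pairs` rule are all
hypothetical names chosen for this example, and the lookahead rule
(skip the output observer when the next op would fuse) is only an
assumption about how a future fusion pass might use the recorded ops.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class SeenOp:
    # Hypothetical record of one op observed while tracing.
    name: str
    needs_output_observer: bool = False


def record_ops(op_names: List[str]) -> List[SeenOp]:
    # Pass 1: only record the ops that were seen; insert nothing yet.
    return [SeenOp(name) for name in op_names]


def insert_observers(
    seen: List[SeenOp], fusible_pairs: Set[Tuple[str, Optional[str]]]
) -> None:
    # Pass 2: the full op list is available, so each op can look ahead.
    for i, op in enumerate(seen):
        next_name = seen[i + 1].name if i + 1 < len(seen) else None
        # Assumed rule: no output observer if this op fuses with the next.
        op.needs_output_observer = (op.name, next_name) not in fusible_pairs


seen = record_ops(["add", "relu", "linear"])
insert_observers(seen, fusible_pairs={("add", "relu")})
flags = [(op.name, op.needs_output_observer) for op in seen]
```

With a single combined pass, the decision for `add` would have to be
made before `relu` had been seen; splitting the passes removes that
ordering constraint.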
Note: the qconfig is still used during tracing to assign each
op an inference dtype. This is not ideal; in the future we may move this
step to a separate pass as well. We keep it as is in this PR because
more refactoring would be needed for the dtype assignment to both
happen in a separate pass and survive module boundaries.
Test Plan:
```
python test/test_quantization.py -k DBR
```
Reviewed By: wenleix
Differential Revision: D33558280
Pulled By: vkuzo
fbshipit-source-id: 54e9cea6ad05317a8c7c92be005d33653617bed6
(cherry picked from commit 2985849916dbd194b6bf44cc3c360e9450da6828)