Invoke more passes in `insertObservers` (#30473)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30473
Invoked the `ConstantPooling` and `FuseLinear` passes before
`insertObservers`.
`ConstantPooling` cleans up the traced graph: when we
have two constant nodes with the same value, this pass merges them,
which allows us to have fewer quantization patterns.
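The effect of constant pooling can be sketched in plain Python (this is a toy model of the idea, not the actual pass; the node tuples and the `uses` map are hypothetical stand-ins for graph IR):

```python
# Toy sketch of constant pooling: deduplicate constant nodes by value so
# that later passes see a single node per distinct value.
nodes = [
    ("c0", "prim::Constant", 0.1),
    ("c1", "prim::Constant", 0.1),  # same value as c0 -> should be merged
    ("c2", "prim::Constant", 0.5),
]
uses = {"q0": "c0", "q1": "c1"}  # hypothetical users of the constants

pooled = {}       # value -> canonical node name
replacement = {}  # old node name -> canonical node name
for name, kind, value in nodes:
    if kind == "prim::Constant":
        canonical = pooled.setdefault(value, name)
        replacement[name] = canonical

# Rewire every use to the canonical constant.
uses = {user: replacement.get(src, src) for user, src in uses.items()}
assert uses == {"q0": "c0", "q1": "c0"}  # both users now share one constant
```

After pooling, quantization pattern matching only has to handle one constant node per value instead of one per duplicate.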
`FuseLinear` merges the exploded linear function back into `aten::linear` so
that we can quantize this function properly. We need to fuse it because right now
the way we recognize weight and bias is by matching the argument position in certain function
calls, e.g. the 1st argument of `aten::conv2d` is the weight. Therefore we have to preserve
the boundary of the linear function to recognize the weight of linear, since in the exploded
linear code the input of `addmm` is the transposed weight rather than the original weight of linear.
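The transposition issue can be illustrated numerically with numpy (a sketch with made-up shapes, not PyTorch code; `x`, `w`, and `b` are hypothetical):

```python
import numpy as np

# Hypothetical shapes for illustration: batch 4, in_features 3, out_features 2.
x = np.arange(12, dtype=float).reshape(4, 3)   # input
w = np.arange(6, dtype=float).reshape(2, 3)    # linear weight, shape (out, in)
b = np.array([1.0, -1.0])                      # bias

# aten::linear semantics: y = x @ w.T + b, so the node sees the weight w directly.
linear_out = x @ w.T + b

# Exploded form from tracing: w_t = w.t(); y = addmm(b, x, w_t).
# addmm only ever sees the transposed weight w_t, not w itself.
w_t = w.T
addmm_out = b + x @ w_t

assert np.allclose(linear_out, addmm_out)
```

Both forms compute the same result, but only the fused `aten::linear` call carries the original weight at a fixed argument position, which is what the position-based weight/bias matching relies on.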
ghstack-source-id: 94887831
Test Plan:
This is needed for quantizing traced model tests to pass
Imported from OSS
Differential Revision: D18795722
fbshipit-source-id: 192d9d1e56307e2e1d90e30dce0502e31cb4f829