QAT ConvBN: remove explicit folding and use BN instead (#38478)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38478
Before this PR, the QAT ConvBN module inlined the batch normalization code
in order to reproduce Conv+BN folding.
This PR updates the module to use BN directly. This is mathematically
equivalent to the previous behavior as long as we properly scale
and fake-quantize the conv weights, but it lets us reuse the BN code
instead of reimplementing it.
In particular, this should help with speed since we can use dedicated
BN kernels, and also with DDP since we can hook up SyncBatchNorm.
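For reference, a minimal sketch of the underlying equivalence (module sizes
and variable names below are illustrative, not taken from the actual
implementation): scaling the conv weight by gamma / sqrt(running_var + eps),
which is the quantity that gets fake-quantized, and then dividing the conv
output by the same per-channel factor before calling the BN module gives the
same result as explicitly folding BN into the conv. The sketch uses eval-mode
BN (running statistics) so both paths are deterministic.
```
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy Conv + BN pair; shapes and statistics are illustrative only.
conv = nn.Conv2d(3, 8, 3, bias=False)
bn = nn.BatchNorm2d(8)
with torch.no_grad():
    bn.weight.uniform_(0.5, 2.0)
    bn.bias.normal_()
    bn.running_mean.normal_()
    bn.running_var.uniform_(0.5, 2.0)
bn.eval()  # use running statistics so both paths are deterministic

x = torch.randn(1, 3, 16, 16)

# Per-channel folding factor: gamma / sqrt(running_var + eps).
scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
# This scaled weight is what would be fake-quantized in both formulations.
scaled_w = conv.weight * scale.reshape(-1, 1, 1, 1)

# (a) Explicit folding (old approach): bake BN into the conv weight and bias.
folded_bias = bn.bias - bn.running_mean * scale
y_folded = F.conv2d(x, scaled_w) + folded_bias.reshape(1, -1, 1, 1)

# (b) Reuse BN (this PR): run conv with the scaled weights, undo the scaling
#     on the activations, then call the BN module directly.
y_bn = bn(F.conv2d(x, scaled_w) / scale.reshape(1, -1, 1, 1))

print(torch.allclose(y_folded, y_bn, atol=1e-4))  # True
```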
Test Plan:
```
python test/test_quantization.py TestQATModule
```
Imported from OSS
Differential Revision: D21603230
fbshipit-source-id: ecf8afdd833b67c2fbd21a8fd14366079fa55e64