onnx export of per channel fake quantize functions (#42835) (#52430)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/39502
This PR adds support for exporting **fake_quantize_per_channel_affine** to a QuantizeLinear/DequantizeLinear pair. Per-tensor support was added by PR https://github.com/pytorch/pytorch/pull/39738.
The `axis` attribute of QuantizeLinear and DequantizeLinear, which is required for per-channel support, was added in opset 13 by https://github.com/onnx/onnx/pull/2772.
[update 1/20/2021]: opset 13 is now supported on master, so the added function is properly tested. The code has also been rebased onto the new master.
The function was also tested offline with the following code:
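For intuition, the computation that a per-channel QuantizeLinear/DequantizeLinear pair performs can be sketched in plain Python. This is a minimal illustration of the opset 13 semantics, not the exporter's code: each slice along `axis` gets its own scale and zero point (here `axis=0` over rows of a 2-D weight; the helper names are made up for this sketch).

```python
def quantize_per_channel(x, scales, zero_points, qmin=-128, qmax=127):
    """Quantize a 2-D list of floats with one scale/zero_point per row (axis=0):
    q = clamp(round(x / scale) + zero_point, qmin, qmax)."""
    return [
        [max(qmin, min(qmax, round(v / s) + zp)) for v in row]
        for row, s, zp in zip(x, scales, zero_points)
    ]

def dequantize_per_channel(q, scales, zero_points):
    """Invert the mapping per row: x ~ (q - zero_point) * scale."""
    return [
        [(v - zp) * s for v in row]
        for row, s, zp in zip(q, scales, zero_points)
    ]

# Two output channels, each with its own quantization parameters.
weights = [[0.10, -0.20], [1.00, 2.00]]
scales = [0.01, 0.05]
zero_points = [0, 0]

q = quantize_per_channel(weights, scales, zero_points)
x = dequantize_per_channel(q, scales, zero_points)
```

Round-tripping through the pair reconstructs the weights up to quantization error, which is why the exporter can represent a fake-quantize node as this two-op sequence.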
```python
import torch
from torch import quantization
from torchvision import models
qat_resnet18 = models.resnet18(pretrained=True).eval().cuda()
qat_resnet18.qconfig = quantization.QConfig(
    activation=quantization.default_fake_quant,
    weight=quantization.default_per_channel_weight_fake_quant)
quantization.prepare_qat(qat_resnet18, inplace=True)
qat_resnet18.apply(quantization.enable_observer)
qat_resnet18.apply(quantization.enable_fake_quant)
dummy_input = torch.randn(16, 3, 224, 224).cuda()
_ = qat_resnet18(dummy_input)
for module in qat_resnet18.modules():
    if isinstance(module, quantization.FakeQuantize):
        module.calculate_qparams()
qat_resnet18.apply(quantization.disable_observer)
qat_resnet18.cuda()
input_names = ["actual_input_1"]
output_names = ["output1"]
torch.onnx.export(qat_resnet18, dummy_input, "quant_model.onnx",
                  input_names=input_names, output_names=output_names,
                  verbose=True, opset_version=13)
```
It generates the desired graph.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42835
Reviewed By: houseroad
Differential Revision: D26293823
Pulled By: SplitInfinity
fbshipit-source-id: 300498a2e24b7731b12fa2fbdea4e73dde80e7ea
Co-authored-by: Hao Wu <skyw@users.noreply.github.com>