Add qconv_test to benchmarking tests (#24913)
Summary:
Adds the tests defined in `qconv_test.py` to `benchmark_all_test.py` so that they are run by `benchmark_all_test`.
The next diff will create another `ai_benchmark_test` specifying the qconv operations, similar to D16768680. Since AI-PEP integrates with `benchmark_all_test`, this should add these qconv benchmarks to AI-PEP.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24913
Test Plan:
`buck run mode/opt caffe2/benchmarks/operator_benchmark:benchmark_all_test` (runs only tests whose `tag` is `short`)
`buck run mode/opt caffe2/benchmarks/operator_benchmark:benchmark_all_test -- --tag_filter resnext101_32x4d` (runs tests whose `tag` is `resnext101_32x4d`).
These commands run the matching tests for all modules imported in `benchmark_all_test.py` (e.g. add_test, batchnorm_test, qconv_test).
```
buck run mode/opt caffe2/benchmarks/operator_benchmark:benchmark_all_test -- --operators QConv2d,QLinear
```
runs only the tests for the QConv2d and QLinear operators.
Relevant output for `qconv_test.py` (for short tag):
```
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_N1_IC64_OC128_H56_W56_G1_kernel1_stride1_pad0
# Input: N: 1, IC: 64, OC: 128, H: 56, W: 56, G: 1, kernel: 1, stride: 1, pad: 0
Forward Execution Time (us) : 957.848
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_N1_IC256_OC256_H56_W56_G32_kernel3_stride1_pad1
# Input: N: 1, IC: 256, OC: 256, H: 56, W: 56, G: 32, kernel: 3, stride: 1, pad: 1
Forward Execution Time (us) : 3638.806
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_N1_IC256_OC256_H56_W56_G1_kernel1_stride1_pad0
# Input: N: 1, IC: 256, OC: 256, H: 56, W: 56, G: 1, kernel: 1, stride: 1, pad: 0
Forward Execution Time (us) : 3870.311
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_N1_IC512_OC512_H56_W56_G32_kernel3_stride2_pad1
# Input: N: 1, IC: 512, OC: 512, H: 56, W: 56, G: 32, kernel: 3, stride: 2, pad: 1
Forward Execution Time (us) : 10052.192
```
For the `resnext101_32x4d` tag:
```
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : resnext101_32x4d
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_N1_IC512_OC512_H14_W14_G32_kernel3_stride1_pad1
# Input: N: 1, IC: 512, OC: 512, H: 14, W: 14, G: 32, kernel: 3, stride: 1, pad: 1
Forward Execution Time (us) : 543.171
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_N1_IC512_OC1024_H28_W28_G1_kernel1_stride2_pad0
# Input: N: 1, IC: 512, OC: 1024, H: 28, W: 28, G: 1, kernel: 1, stride: 2, pad: 0
Forward Execution Time (us) : 1914.301
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_N1_IC512_OC256_H28_W28_G1_kernel1_stride1_pad0
# Input: N: 1, IC: 512, OC: 256, H: 28, W: 28, G: 1, kernel: 1, stride: 1, pad: 0
Forward Execution Time (us) : 1809.069
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_N1_IC512_OC512_H28_W28_G1_kernel1_stride1_pad0
# Input: N: 1, IC: 512, OC: 512, H: 28, W: 28, G: 1, kernel: 1, stride: 1, pad: 0
Forward Execution Time (us) : 3100.579
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_N1_IC512_OC512_H28_W28_G32_kernel3_stride2_pad1
# Input: N: 1, IC: 512, OC: 512, H: 28, W: 28, G: 32, kernel: 3, stride: 2, pad: 1
Forward Execution Time (us) : 2247.540
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_N1_IC64_OC128_H56_W56_G1_kernel1_stride1_pad0
# Input: N: 1, IC: 64, OC: 128, H: 56, W: 56, G: 1, kernel: 1, stride: 1, pad: 0
Forward Execution Time (us) : 1001.731
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_N1_IC64_OC256_H56_W56_G1_kernel1_stride1_pad0
# Input: N: 1, IC: 64, OC: 256, H: 56, W: 56, G: 1, kernel: 1, stride: 1, pad: 0
Forward Execution Time (us) : 1571.620
```
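For comparing runs, the output above can be reduced to a name-to-latency mapping. The following is a minimal sketch, assuming only the line format shown in the logs above (`# Name: ...` followed by `Forward Execution Time (us) : ...`). The helper `parse_benchmark_output` is hypothetical and is not part of `operator_benchmark`.

```python
import re

# Matches lines such as "# Name: QConv2d_N1_IC64_..." in the benchmark output.
NAME_RE = re.compile(r"^# Name:\s*(\S+)")
# Matches lines such as "Forward Execution Time (us) : 957.848".
TIME_RE = re.compile(r"^Forward Execution Time \(us\)\s*:\s*([0-9.]+)")


def parse_benchmark_output(text):
    """Hypothetical helper: map each benchmark name to its forward time in us.

    Assumes every "# Name:" line is followed (before the next "# Name:" line)
    by one "Forward Execution Time (us)" line, as in the logs above.
    """
    results = {}
    current_name = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        name_match = NAME_RE.match(line)
        if name_match:
            current_name = name_match.group(1)
            continue
        time_match = TIME_RE.match(line)
        if time_match and current_name is not None:
            results[current_name] = float(time_match.group(1))
            current_name = None
    return results


# Two entries copied from the short-tag output above.
SAMPLE = """\
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_N1_IC64_OC128_H56_W56_G1_kernel1_stride1_pad0
# Input: N: 1, IC: 64, OC: 128, H: 56, W: 56, G: 1, kernel: 1, stride: 1, pad: 0
Forward Execution Time (us) : 957.848
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_N1_IC256_OC256_H56_W56_G32_kernel3_stride1_pad1
# Input: N: 1, IC: 256, OC: 256, H: 56, W: 56, G: 32, kernel: 3, stride: 1, pad: 1
Forward Execution Time (us) : 3638.806
"""

if __name__ == "__main__":
    for name, time_us in parse_benchmark_output(SAMPLE).items():
        print(name, time_us)
```

The parser keys on the `# Name:` line rather than `# Input:` because the name already encodes the full configuration in a single token.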
Differential Revision: D16908445
Pulled By: rohan-varma
fbshipit-source-id: b711bc3591ce5dcd3ab2521134cff2b12188e3ac