group quantized op benchmarks into a new binary (#29288)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29288
More quantized operators have been added to the benchmark suite. Splitting them out from the non-quantized ones into a separate binary makes it easier to benchmark the quantized operators on their own.
Test Plan:
```
buck run mode/dev-nosan //caffe2/benchmarks/operator_benchmark:benchmark_all_quantized_test -- --iterations 1
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short
# Benchmarking PyTorch: QConv2d
# Mode: Eager
# Name: QConv2d_kernel3_G32_H56_OC512_N1_stride2_pad1_W56_IC512
# Input: kernel: 3, G: 32, H: 56, OC: 512, N: 1, stride: 2, pad: 1, W: 56, IC: 512
Forward Execution Time (us) : 5614.996
# Benchmarking PyTorch: QLinear
# Mode: Eager
# Name: QLinear_N6400_IN141_OUT15
# Input: N: 6400, IN: 141, OUT: 15
Forward Execution Time (us) : 2829.075
```
Reviewed By: hl475
Differential Revision: D18349850
fbshipit-source-id: 5b2fd9c1d5a25068592e5059909bb6c14095f397