reorganize test binaries of op bench (#30023)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30023
This diff doesn't change how users run the benchmarks. But under the hood, we group all the tests into three groups: unary test, quantized test, and the rest ops (we name it others here).
Test Plan:
```
buck run //caffe2/benchmarks/operator_benchmark:benchmark_all_test -- --iterations 1
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short
# Benchmarking PyTorch: abs
# Mode: Eager
# Name: abs_M512_N512_cpu
# Input: M: 512, N: 512, device: cpu
Forward Execution Time (us) : 17914.301
...
# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M64_N64_K64_cpu_bwd2
# Input: M: 64, N: 64, K: 64, device: cpu
Backward Execution Time (us) : 66525.855
...
# Benchmarking PyTorch: mul
# Mode: Eager
# Name: mul_N2_dtypetorch.qint32_contigTrue
# Input: N: 2, dtype: torch.qint32, contig: True
Forward Execution Time (us) : 290.555
...
Reviewed By: hl475
Differential Revision: D18574719
fbshipit-source-id: f7ff1d952031129adde51ebf002e4891bd484680