reduce input shapes of long tag in op bench (#29865)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29865
For some operators, the number of tests (forward + backward) under the long tag can easily exceed 100. Many of these are likely redundant, so this diff reduces the number of input shapes used for the long tag.
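As a rough illustration of why the count grows so quickly, here is a minimal sketch that assumes the long-tag configs are built as a full cross product of per-attribute value lists. The helper `count_tests` and the value lists below are hypothetical and are not taken from the benchmark configs.

```python
import itertools


def count_tests(attrs, backward=True):
    """Count tests generated by a full cross product of attribute values."""
    configs = list(itertools.product(*attrs.values()))
    # Each config yields one forward test, plus one backward test if enabled.
    return len(configs) * (2 if backward else 1)


# Hypothetical value lists, chosen only to show the arithmetic.
before = {"M": [8, 64, 128], "N": [8, 64, 128], "K": [8, 64, 128], "device": ["cpu", "cuda"]}
after = {"M": [8, 128], "N": [32, 64], "K": [256, 512], "device": ["cpu", "cuda"]}

print(count_tests(before))  # 3*3*3*2 = 54 configs, 108 tests
print(count_tests(after))   # 2*2*2*2 = 16 configs, 32 tests
```

Under that assumption, dropping one value from each shape dimension cuts the test count by more than a factor of three.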
Test Plan:
```
buck run //caffe2/benchmarks/operator_benchmark:benchmark_all_test -- --iterations 1
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short
# Benchmarking PyTorch: add
# Mode: Eager
# Name: add_M64_N64_K64_cpu
# Input: M: 64, N: 64, K: 64, device: cpu
Forward Execution Time (us) : 28418.926
...
```
Reviewed By: hl475
Differential Revision: D18520946
fbshipit-source-id: 1056d6d5a9c46bc2d508ff133039aefeb9d11c27