unify matmul benchmark (#28899)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28899
As title: unify the matmul operator benchmark.
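
For reading the test plan output: the reported `Name` appears to encode the same fields as the `Input` line. A minimal sketch of that mapping follows, inferred only from the single case shown in the test plan. `benchmark_name` is a hypothetical helper written for illustration, not part of the operator_benchmark API, and the real naming logic may differ.

```python
def benchmark_name(op, config):
    """Join an operator name and its config into one identifier.

    Hypothetical helper (assumption): the device value is appended bare,
    and every other key is written immediately followed by its value.
    """
    parts = [op]
    for key, value in config.items():
        parts.append(str(value) if key == "device" else f"{key}{value}")
    return "_".join(parts)


# Config copied from the "Input" line of the test plan output.
config = {
    "M": 128,
    "N": 128,
    "K": 128,
    "trans_a": True,
    "trans_b": False,
    "device": "cpu",
}

name = benchmark_name("matmul", config)
print(name)
```

Under these assumptions the printed string matches the `Name` line of the run in the test plan.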
Test Plan:
```
buck run mode/opt //caffe2/benchmarks/operator_benchmark/pt:matmul_test
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short
# Benchmarking PyTorch: matmul
# Mode: Eager
# Name: matmul_M128_N128_K128_trans_aTrue_trans_bFalse_cpu
# Input: M: 128, N: 128, K: 128, trans_a: True, trans_b: False, device: cpu
Forward Execution Time (us) : 39.535
```
Reviewed By: hl475
Differential Revision: D18228271
fbshipit-source-id: 681ed2745c25a122997346a23acdbc67e55e5ec4