Fix operator-level benchmark to use NHWC layout (#26577)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26577
Make the benchmark use the NHWC layout expected by the qconv kernel.
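For reference, NHWC stores the channel dimension innermost, whereas NCHW stores width innermost. The following is a minimal pure-Python sketch of that offset arithmetic only; the helper names are hypothetical and it does not reproduce how the benchmark itself builds its inputs.

```python
def nchw_offset(n, c, h, w, C, H, W):
    """Flat offset of element (n, c, h, w) in a contiguous NCHW buffer."""
    return ((n * C + c) * H + h) * W + w


def nhwc_offset(n, c, h, w, C, H, W):
    """Flat offset of the same logical element in a contiguous NHWC buffer."""
    return ((n * H + h) * W + w) * C + c


def nchw_to_nhwc(flat, N, C, H, W):
    """Reorder a flat NCHW list into NHWC order (hypothetical helper)."""
    out = [None] * len(flat)
    for n in range(N):
        for c in range(C):
            for h in range(H):
                for w in range(W):
                    src = nchw_offset(n, c, h, w, C, H, W)
                    dst = nhwc_offset(n, c, h, w, C, H, W)
                    out[dst] = flat[src]
    return out
```

In NHWC, all channels of one pixel are adjacent in memory, which is the arrangement the summary says the qconv kernel expects.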
Results for resnext101-32x4d shapes:
Before:
```
Forward Execution Time (us) : 4787.046
Forward Execution Time (us) : 1320.065
Forward Execution Time (us) : 2611.631
Forward Execution Time (us) : 2562.389
Forward Execution Time (us) : 1072.342
Forward Execution Time (us) : 2330.658
Forward Execution Time (us) : 1894.549
Forward Execution Time (us) : 3446.532
Forward Execution Time (us) : 2381.251
Forward Execution Time (us) : 1157.339
Forward Execution Time (us) : 2712.621
Forward Execution Time (us) : 3789.905
Forward Execution Time (us) : 4057.886
Forward Execution Time (us) : 6104.570
Forward Execution Time (us) : 11328.552
Forward Execution Time (us) : 3707.519
Forward Execution Time (us) : 4681.272
Forward Execution Time (us) : 2459.266
Forward Execution Time (us) : 849.564
Forward Execution Time (us) : 3000.764
Forward Execution Time (us) : 3019.704
Forward Execution Time (us) : 5216.046
Forward Execution Time (us) : 3403.549
Forward Execution Time (us) : 1291.878
Forward Execution Time (us) : 2057.147
```
After:
```
Forward Execution Time (us) : 4398.649
Forward Execution Time (us) : 993.619
Forward Execution Time (us) : 2252.265
Forward Execution Time (us) : 2230.500
Forward Execution Time (us) : 977.389
Forward Execution Time (us) : 2233.356
Forward Execution Time (us) : 1223.085
Forward Execution Time (us) : 2758.765
Forward Execution Time (us) : 2208.028
Forward Execution Time (us) : 821.816
Forward Execution Time (us) : 2396.748
Forward Execution Time (us) : 2505.803
Forward Execution Time (us) : 2771.251
Forward Execution Time (us) : 4816.474
Forward Execution Time (us) : 10065.299
Forward Execution Time (us) : 2424.949
Forward Execution Time (us) : 3854.800
Forward Execution Time (us) : 2297.426
Forward Execution Time (us) : 682.403
Forward Execution Time (us) : 2297.541
Forward Execution Time (us) : 2317.828
Forward Execution Time (us) : 4517.372
Forward Execution Time (us) : 2716.691
Forward Execution Time (us) : 942.385
Forward Execution Time (us) : 1717.172
```
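The two listings can be compared row by row. This sketch assumes the rows appear in the same shape order in both runs, which the output above does not state explicitly; the variable names are illustrative.

```python
import math

# Forward execution times in microseconds, copied from the listings above.
before_us = [
    4787.046, 1320.065, 2611.631, 2562.389, 1072.342,
    2330.658, 1894.549, 3446.532, 2381.251, 1157.339,
    2712.621, 3789.905, 4057.886, 6104.570, 11328.552,
    3707.519, 4681.272, 2459.266, 849.564, 3000.764,
    3019.704, 5216.046, 3403.549, 1291.878, 2057.147,
]
after_us = [
    4398.649, 993.619, 2252.265, 2230.500, 977.389,
    2233.356, 1223.085, 2758.765, 2208.028, 821.816,
    2396.748, 2505.803, 2771.251, 4816.474, 10065.299,
    2424.949, 3854.800, 2297.426, 682.403, 2297.541,
    2317.828, 4517.372, 2716.691, 942.385, 1717.172,
]

# Assumption: row i of each list measures the same shape.
speedups = [b / a for b, a in zip(before_us, after_us)]

# Geometric mean is the usual way to average ratios.
geo_mean = math.exp(sum(math.log(s) for s in speedups) / len(speedups))

for i, s in enumerate(speedups):
    print(f"shape {i:2d}: {s:.2f}x")
print(f"geometric mean speedup: {geo_mean:.2f}x")
```

Under that pairing assumption, every row is faster after the change.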
ghstack-source-id: 90536232
Test Plan: buck build mode/opt caffe2/benchmarks/operator_benchmark/pt:qconv_test --show-output
Differential Revision: D17512291
fbshipit-source-id: 7764b2ab38e0e8e0aab982006915176638004df6