[iOS GPU][Kernel] Implement mean.dim using MPSReduce kernel (#56073)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56073
Implement the `mean.dim` operator for the Metal backend. Reducing over the batch dimension is not supported yet.
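For reference, the semantics the operator is expected to match can be sketched in plain Python. This is only an illustrative CPU reference, not the Metal/MPS kernel from this change: the helper name `mean_dim`, the flat row-major layout, and the choice to raise `ValueError` for dim 0 are all assumptions made for the sketch.

```python
from itertools import product


def mean_dim(data, shape, dims, keepdim=False):
    """Reference mean over `dims` of a row-major tensor stored as a flat list.

    Hypothetical helper for illustration only. Assumption: dim 0 is the batch
    dimension and is rejected, mirroring the limitation stated above.
    """
    rank = len(shape)
    dims = sorted({d % rank for d in dims})  # normalize negative dims
    if 0 in dims:
        raise ValueError("reducing the batch dimension is not supported")
    kept = [i for i in range(rank) if i not in dims]

    # Number of elements averaged into each output value.
    count = 1
    for d in dims:
        count *= shape[d]

    # Row-major strides for the input tensor.
    strides = [1] * rank
    for i in range(rank - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]

    out = []
    for kept_idx in product(*(range(shape[i]) for i in kept)):
        total = 0.0
        for red_idx in product(*(range(shape[d]) for d in dims)):
            idx = [0] * rank
            for i, v in zip(kept, kept_idx):
                idx[i] = v
            for d, v in zip(dims, red_idx):
                idx[d] = v
            total += data[sum(i * s for i, s in zip(idx, strides))]
        out.append(total / count)

    # Reduced dims become size 1 with keepdim, otherwise they are dropped.
    if keepdim:
        out_shape = [1 if i in dims else shape[i] for i in range(rank)]
    else:
        out_shape = [shape[i] for i in kept]
    return out, out_shape


# Example: an NCHW tensor of shape [1, 5, 2, 2], averaged over H and W.
values = [float(i) for i in range(20)]
result, result_shape = mean_dim(values, [1, 5, 2, 2], [2, 3])
```

The input shape in the example is illustrative; a reference like this could be used to cross-check the Metal output on small inputs.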
ghstack-source-id: 126802129
Test Plan:
- Sandcastle
- CircleCI
- Unit tests
```
2021-03-23 13:01:29.663842-0700 PyTorchPlayground[64572:9575354] [bool test_mean_dim()],[1 5 2 2 ],[SUCCEED]
2021-03-23 13:01:29.666230-0700 PyTorchPlayground[64572:9575354] [bool test_mean_dim2()],[1 5 2 2 ],[SUCCEED]
```
Reviewed By: dhruvbird
Differential Revision: D27269394
fbshipit-source-id: fafcdde50ac457a8488c6170d0a8d3db1871439b