[OSS][Metal] Support Resnet models
Summary:
This diff adds the missing ops needed to run the Resnet models from Torchvision. Moving the tensors to the GPU significantly improves performance, cutting forward time from about 150 ms to about 41 ms (roughly 3.7x) on an iPhone 11, as shown below.
Time running on CPU (ms):
```
forward took: 166.115
forward took: 150.722
forward took: 150.383
forward took: 150.345
forward took: 150.761
forward took: 150.533
forward took: 150.588
forward took: 150.812
forward took: 150.925
forward took: 150.25
```
Time running on GPU (ms):
```
forward took: 39.9355
forward took: 41.3531
forward took: 41.798
forward took: 40.4744
forward took: 39.5181
forward took: 42.6464
forward took: 41.2658
forward took: 40.0862
forward took: 42.3533
forward took: 41.9348
```
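The two logs above can be summarized with a short script. This is a minimal sketch for illustration, not part of the diff: the variable names are made up here, and treating the first run as a warm-up is an assumption based only on the first CPU sample being visibly slower.

```python
from statistics import mean

# Forward times in ms, copied from the CPU and GPU logs above.
cpu_ms = [166.115, 150.722, 150.383, 150.345, 150.761,
          150.533, 150.588, 150.812, 150.925, 150.25]
gpu_ms = [39.9355, 41.3531, 41.798, 40.4744, 39.5181,
          42.6464, 41.2658, 40.0862, 42.3533, 41.9348]

# Mean over all ten runs.
cpu_mean = mean(cpu_ms)
gpu_mean = mean(gpu_ms)
speedup = cpu_mean / gpu_mean

# Assumption: the first run includes warm-up cost, so also report
# the ratio with the first sample of each log dropped.
steady_speedup = mean(cpu_ms[1:]) / mean(gpu_ms[1:])

print(f"CPU mean: {cpu_mean:.2f} ms")
print(f"GPU mean: {gpu_mean:.2f} ms")
print(f"Speedup (all runs): {speedup:.2f}x")
print(f"Speedup (first run dropped): {steady_speedup:.2f}x")
```

Both ratios land between 3.6x and 3.8x, so the conclusion does not depend on whether the first run is counted.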
Discrepancy in results (top-10 outputs, GPU vs. CPU):
```
GPU:
"(623, 4.6211)",
"(111, 3.8809)",
"(499, 3.8555)",
"(596, 3.8047)",
"(473, 3.7422)",
"(846, 3.5762)",
"(892, 3.5449)",
"(813, 3.5098)",
"(446, 3.5020)",
"(902, 3.4980)"
CPU:
"(623, 4.4229)",
"(499, 3.8321)",
"(596, 3.6192)",
"(111, 3.5295)",
"(813, 3.4848)",
"(584, 3.3979)",
"(418, 3.3357)",
"(473, 3.2760)",
"(846, 3.2745)",
"(902, 3.2376)"
```
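One way to quantify the discrepancy above is to compare the two lists directly. The sketch below is illustrative only and not part of the diff; it assumes each tuple is (class index, score), which the log does not state explicitly, and the variable names are hypothetical.

```python
# Top-10 entries copied from the GPU and CPU results above.
# Assumption: each tuple is (class index, score).
gpu_top = [(623, 4.6211), (111, 3.8809), (499, 3.8555), (596, 3.8047),
           (473, 3.7422), (846, 3.5762), (892, 3.5449), (813, 3.5098),
           (446, 3.5020), (902, 3.4980)]
cpu_top = [(623, 4.4229), (499, 3.8321), (596, 3.6192), (111, 3.5295),
           (813, 3.4848), (584, 3.3979), (418, 3.3357), (473, 3.2760),
           (846, 3.2745), (902, 3.2376)]

gpu = dict(gpu_top)
cpu = dict(cpu_top)

# Classes that appear in both top-10 lists, and those unique to each.
shared = sorted(gpu.keys() & cpu.keys())
gpu_only = sorted(gpu.keys() - cpu.keys())
cpu_only = sorted(cpu.keys() - gpu.keys())

# Largest absolute score gap among the shared classes.
worst = max(shared, key=lambda c: abs(gpu[c] - cpu[c]))
worst_gap = abs(gpu[worst] - cpu[worst])

print("top-1 agrees:", gpu_top[0][0] == cpu_top[0][0])
print("shared classes:", len(shared), "of 10")
print("GPU only:", gpu_only, "CPU only:", cpu_only)
print(f"largest gap: class {worst}, {worst_gap:.4f}")
```

On this data the top-1 class matches and the two lists share 8 of 10 entries, so the difference is mostly in ordering and score magnitude rather than in which classes are selected.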
Test Plan: {F340824316}
Reviewed By: IvanKobzarev
Differential Revision: D24416294
fbshipit-source-id: 12c9199ade0b76a7aa8a3838eddc4c19c79b6f37