Add nvfuser support for torchvision models (#744)
Summary:
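Run commands and results for mnasnet1_0 (eval, CUDA): baseline first, then `--nvfuser fuser1`, then `--nvfuser fuser2`: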
```
python run.py mnasnet1_0 -d cuda -t eval
GPU Time: 12.160 milliseconds
CPU Dispatch Time: 12.070 milliseconds
CPU Total Wall Time: 12.184 milliseconds
```
```
python run.py mnasnet1_0 -d cuda -t eval --nvfuser fuser1
GPU Time: 11.600 milliseconds
CPU Dispatch Time: 11.305 milliseconds
CPU Total Wall Time: 11.604 milliseconds
```
```
python run.py mnasnet1_0 -d cuda -t eval --nvfuser fuser2
GPU Time: 11.609 milliseconds
CPU Dispatch Time: 11.377 milliseconds
CPU Total Wall Time: 11.610 milliseconds
```
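For reference, the relative change in GPU time can be worked out from the numbers above. This is a minimal sketch: the `gpu_ms` dictionary and the `reduction_pct` helper are illustrative names, not part of the benchmark code, and the values are copied by hand from the three runs.

```python
# GPU times in milliseconds, copied from the three runs above.
# The dictionary keys are illustrative labels, not benchmark identifiers.
gpu_ms = {"baseline": 12.160, "fuser1": 11.600, "fuser2": 11.609}


def reduction_pct(baseline, candidate):
    """Percentage by which `candidate` is lower than `baseline`."""
    return (baseline - candidate) / baseline * 100.0


for name in ("fuser1", "fuser2"):
    pct = reduction_pct(gpu_ms["baseline"], gpu_ms[name])
    print(f"{name}: {pct:.1f}% lower GPU time than baseline")
```

With these numbers both fusers come out roughly 4.5 to 4.6% below the baseline GPU time. The source reports one measurement per configuration, so treat the difference as indicative only.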
Pull Request resolved: https://github.com/pytorch/benchmark/pull/744
Reviewed By: davidberard98
Differential Revision: D34107295
Pulled By: xuzhao9
fbshipit-source-id: 59e5d8f5d90484eb7ca744ebd5e486a21fbd0bdb