Implement `enable_amp` for vision & huggingface models (#1330)
Summary:
Pull Request resolved: https://github.com/pytorch/benchmark/pull/1330
Tested with:
```
python run.py [hf_T5_large|resnet50] -t train -d cuda --precision [fp32|amp] --torchdynamo inductor --skip_correctness
```
* resnet50 w/ fp32: 37ms
* resnet50 w/ amp: 25ms
* hf_T5_large w/ fp32: 418ms
* hf_T5_large w/ amp: 127ms
I ran with `--skip_correctness` to avoid a correctness check that fails due to a dtype mismatch (AMP produces lower-precision outputs than the fp32 baseline). The same failure also occurs for timm, where `enable_amp` is already implemented.
Also verified that these two models do not error out with `-t eval`.
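For context, a minimal sketch of the pattern being enabled here: wrapping the model's forward pass in `torch.autocast` when AMP is requested. The `Model` class and `enable_amp` method below are illustrative assumptions, not the actual torchbenchmark API; CPU autocast with bf16 stands in for the CUDA/fp16 path used in the benchmarks.

```python
import contextlib
import torch

class Model:
    """Hypothetical benchmark model wrapper; names are illustrative only."""

    def __init__(self, precision="fp32", device="cpu"):
        self.device = device
        # Default: no-op context, so fp32 runs are unaffected.
        self.amp_context = contextlib.nullcontext()
        if precision == "amp":
            self.enable_amp()

    def enable_amp(self):
        # CUDA autocast defaults to fp16; CPU autocast uses bf16.
        dtype = torch.float16 if self.device == "cuda" else torch.bfloat16
        self.amp_context = torch.autocast(device_type=self.device, dtype=dtype)

    def forward(self, x, w):
        # Under autocast, matmuls run in the lower precision, which is
        # why AMP outputs can mismatch an fp32 correctness baseline.
        with self.amp_context:
            return x @ w

m = Model(precision="amp", device="cpu")
out = m.forward(torch.randn(4, 4), torch.randn(4, 4))
print(out.dtype)  # bfloat16 under CPU autocast
```

This also illustrates why `--skip_correctness` is needed: the AMP output dtype differs from the fp32 reference.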
Test Plan: Imported from OSS
Reviewed By: xuzhao9
Differential Revision: D41620520
Pulled By: davidberard98
fbshipit-source-id: d3d3dcfb786391e20bbe32975170b77f1efe02e7