Remove "NotImplemented" flags if the test is limited by hardware capacity (#781)
Summary:
When a test is flagged as "NotImplemented", there are actually two cases:
1. The test itself doesn't implement or handle the configs, e.g., unsupervised-learning models like pytorch_struct don't have `eval()` tests, and the pyhpc models don't have `train()` tests.
2. The test can't run on our T4 CI GPU machine because of its limited hardware capacity, but it runs fine on other GPUs, such as `V100` or `A100`.
This PR eliminates the second case, so that those tests can still run through the `run.py` or `run_sweep.py` interfaces. Instead, we flag the test as `not_implemented` in its `metadata.yaml`, and the CI scripts `test.py` and `test_bench.py` read the metadata and determine that the test is not suitable to run on the CI machine.
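As a rough illustration of the CI-side check, here is a minimal sketch. It assumes the metadata has already been parsed into a dict and that `not_implemented` holds a list of device/test pairs. That schema, and the helper names `skip_on_ci` and `ci_plan`, are hypothetical and are not taken from the actual `test.py` or `test_bench.py`.

```python
# Hypothetical, already-parsed contents of one model's metadata.yaml.
# The schema (a list of device/test pairs) is an assumption for illustration.
METADATA = {
    "not_implemented": [
        {"device": "cuda", "test": "train"},
    ],
}


def skip_on_ci(metadata, device, test):
    """Return True if the (device, test) config is flagged not_implemented."""
    for entry in metadata.get("not_implemented", []):
        # A key missing from an entry is treated as a wildcard.
        device_matches = entry.get("device", device) == device
        test_matches = entry.get("test", test) == test
        if device_matches and test_matches:
            return True
    return False


def ci_plan(metadata, devices=("cpu", "cuda"), tests=("train", "eval")):
    """List the configs a CI runner would still execute for this model."""
    return [
        (device, test)
        for device in devices
        for test in tests
        if not skip_on_ci(metadata, device, test)
    ]


if __name__ == "__main__":
    for device, test in ci_plan(METADATA):
        print(f"would run: {test} on {device}")
```

Only the CI runner consults the flag in this sketch, so a direct invocation on a larger GPU is not blocked by it.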
This fixes https://github.com/pytorch/benchmark/issues/688, https://github.com/pytorch/benchmark/issues/626, and https://github.com/pytorch/benchmark/issues/598
Pull Request resolved: https://github.com/pytorch/benchmark/pull/781
Reviewed By: aaronenyeshi
Differential Revision: D34786277
Pulled By: xuzhao9
fbshipit-source-id: d5d3d884839345f4fcad21ccf541a02d8e705f5f