Run 15 rounds and collect the max delta (#1352)
Summary:
Improve the stability check so that it more closely matches how the nightly job runs.
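For illustration only, a minimal sketch of the "run 15 rounds and collect the max delta" idea. It is not the code in this PR. The names `run_rounds` and `max_delta` are hypothetical, and the definition of "delta" as the relative spread `(max - min) / min` across rounds is an assumption; the actual check may define it differently.

```python
import time
from typing import Callable, List


def run_rounds(fn: Callable[[], None], rounds: int = 15) -> List[float]:
    """Time `fn` once per round and return the wall-clock latencies in seconds.

    Hypothetical helper: the real benchmark harness may measure differently.
    """
    latencies = []
    for _ in range(rounds):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return latencies


def max_delta(latencies: List[float]) -> float:
    """Return the largest relative spread across rounds: (max - min) / min.

    Assumed definition of "max delta"; not taken from the PR.
    """
    if not latencies:
        raise ValueError("need at least one latency sample")
    fastest = min(latencies)
    if fastest <= 0:
        raise ValueError("latencies must be positive")
    return (max(latencies) - fastest) / fastest


if __name__ == "__main__":
    samples = run_rounds(lambda: sum(range(10_000)), rounds=15)
    print(f"rounds={len(samples)} max_delta={max_delta(samples):.2%}")
```

Under this assumption, a stability check would compare the returned value against a tolerance chosen per device, since CPU and GPU runs can differ in noise.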
Test workflow (A100, CUDA, train, eval): https://github.com/pytorch/benchmark/actions/runs/3760244112
Test workflow 2 (A100, CUDA, train, eval): https://github.com/pytorch/benchmark/actions/runs/4008483390
Test workflow 3 (CPU, train, eval): https://github.com/pytorch/benchmark/actions/runs/4027564709
Test workflow 4 (CPU, train, eval):
Test workflow 5 (T4, CUDA, train, eval):
Test workflow 6 (T4, CUDA, train, eval):
Pull Request resolved: https://github.com/pytorch/benchmark/pull/1352
Reviewed By: weiwangmeta
Differential Revision: D42746997
Pulled By: xuzhao9
fbshipit-source-id: 403f658361a2553bc3def9538a91fe669228704b