Fixes regnet and resnest models (#535)
Summary:
# Regnety_016 model
## Eval
### Batch size analysis
Batch Size | GPU Time | CPU Dispatch Time | Walltime | GPU Delta (relative change in GPU Time vs. previous batch size)
-- | -- | -- | -- | --
1 | 20.033 | 19.876 | 20.036 | -
2 | 19.903 | 19.698 | 19.91 | -0.006489292667
4 | 20.383 | 19.949 | 20.388 | 0.02411696729
8 | 26.036 | 20.438 | 26.043 | 0.2773389589
16 | 41.678 | 22.1 | 41.689 | 0.6007835305
32 | 70.706 | 20.977 | 70.716 | 0.6964825567
64 | 123.295 | 21.309 | 123.333 | 0.7437699771
128 | 232.932 | 19.394 | 232.981 | 0.8892250294
256 | 450.733 | 20.206 | 450.76 | 0.9350411279
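
The GPU Delta values in the table appear to be the relative change in GPU Time from one batch size to the next, i.e. `(t_cur - t_prev) / t_prev`. That reading is inferred from the numbers, not stated in the original. A minimal sketch that reproduces the column under that assumption (the helper name `gpu_deltas` is hypothetical, and the times are copied from the eval table above):

```python
# GPU times per batch size, copied from the Regnety_016 eval table.
gpu_time = {
    1: 20.033, 2: 19.903, 4: 20.383, 8: 26.036, 16: 41.678,
    32: 70.706, 64: 123.295, 128: 232.932, 256: 450.733,
}


def gpu_deltas(times):
    """Relative change in GPU time versus the previous (smaller) batch size.

    Assumption: this is how the GPU Delta column was computed; the
    formula is inferred from the table values.
    """
    sizes = sorted(times)
    return {
        cur: (times[cur] - times[prev]) / times[prev]
        for prev, cur in zip(sizes, sizes[1:])
    }


deltas = gpu_deltas(gpu_time)
for batch_size, delta in deltas.items():
    print(f"{batch_size:>3} | {delta:.6f}")
```

Under this reading, a delta near 0 means that doubling the batch size costs almost no extra GPU time, while a delta near 1 means GPU time roughly doubles with the batch size.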
### Idleness analysis

## Train
### Batch size analysis
Batch Size | GPU Time | CPU Dispatch Time | Walltime | GPU Delta (relative change in GPU Time vs. previous batch size)
-- | -- | -- | -- | --
1 | 18.152 | 17.953 | 18.157 | -
2 | 18.244 | 17.854 | 18.25 | 0.005068312032
4 | 20.58 | 17.555 | 20.586 | 0.128042096
8 | 25.618 | 18.979 | 25.625 | 0.2448007775
16 | 41.393 | 19.225 | 41.402 | 0.6157779686
32 | 70.589 | 24.852 | 70.603 | 0.7053366511
64 | 123.592 | 20.022 | 123.614 | 0.7508676989
128 | 232.986 | 19.86 | 233.063 | 0.8851220144
256 | 451.207 | 20.508 | 451.252 | 0.9366270935
### Idleness analysis
Profiling result for the best batch size (bs=32):

Pull Request resolved: https://github.com/pytorch/benchmark/pull/535
Reviewed By: aaronenyeshi
Differential Revision: D32066975
Pulled By: xuzhao9
fbshipit-source-id: 545714f0f0074055eeef582978420e8115dba0d9