ddp_experiments script: add nccl-socket-ifname arg, fix timeout (#1489)
Summary:
Fixes: https://github.com/pytorch/benchmark/issues/1486.
The `--timeout` flag was being set incorrectly, so the requested timeout was not applied to the job. This fixes that.
`NCCL_SOCKET_IFNAME` was not configurable from the command line. This adds a `--nccl-socket-ifname` arg.
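For reviewers unfamiliar with the pattern, here is a minimal sketch of how a CLI arg like this can be forwarded to NCCL through the environment. It is not the code in this PR: `build_parser`, `nccl_env`, and the default timeout value are hypothetical names and values chosen for illustration. Only the flag names and the `NCCL_SOCKET_IFNAME` variable come from this change.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the real script defines many more options.
    parser = argparse.ArgumentParser(description="illustrative sketch only")
    parser.add_argument(
        "--timeout",
        type=int,
        default=120,  # assumed default, not taken from the real script
        help="job timeout in minutes",
    )
    parser.add_argument(
        "--nccl-socket-ifname",
        type=str,
        default=None,
        help="value to export as NCCL_SOCKET_IFNAME; unset if omitted",
    )
    return parser


def nccl_env(args: argparse.Namespace, base_env=None) -> dict:
    # Copy the base environment and only set the variable when the
    # user passed the flag, so NCCL's own default applies otherwise.
    env = dict(os.environ if base_env is None else base_env)
    if args.nccl_socket_ifname is not None:
        env["NCCL_SOCKET_IFNAME"] = args.nccl_socket_ifname
    return env


args = build_parser().parse_args(["--timeout", "15", "--nccl-socket-ifname", "asdf"])
env = nccl_env(args, base_env={})
print(args.timeout, env)
```

Leaving the variable unset when the flag is omitted keeps NCCL's interface auto-detection as the default behavior.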
Testing: manually tested with:
```
python userbenchmark/ddp_experiments/__init__.py --ngpus 8 --distributed ddp --nodes 1 --filter_models resnet50 --timeout 15 --nccl-socket-ifname asdf
```
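* verified that the Slurm job gets killed after ~15 minutes
* in the logs, I observe `NCCL INFO NCCL_SOCKET_IFNAME set to asdf` followed by `NCCL WARN Bootstrap : no socket interface found`. The warning is expected, since `asdf` is not a real interface; it confirms that the value reaches NCCL.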
Pull Request resolved: https://github.com/pytorch/benchmark/pull/1489
Reviewed By: xuzhao9
Differential Revision: D44153156
Pulled By: davidberard98
fbshipit-source-id: 6a4d5911e8ab0ef0243c20d268e56f3247df091c