Call dist.init_process_group only if it's not already initialized (#1499)
Summary:
Otherwise, initializing the model twice in the same Python process fails with:
```
Traceback (most recent call last):
  File "/fsx/users/janeyx/conda/envs/torchbenchmark/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/fsx/users/janeyx/conda/envs/torchbenchmark/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/scratch/janeyx/work/benchmark/userbenchmark/optim/__init__.py", line 406, in <module>
    run(sys.argv[1:])
  File "/scratch/janeyx/work/benchmark/userbenchmark/optim/__init__.py", line 397, in run
    results = run_benchmarks(args.optims, args.funcs, args.models, args.devices)
  File "/scratch/janeyx/work/benchmark/userbenchmark/optim/__init__.py", line 336, in run_benchmarks
    bm = run_model(mn, d, O, defaults, func_str)
  File "/scratch/janeyx/work/benchmark/userbenchmark/optim/__init__.py", line 313, in run_model
    raise e
  File "/scratch/janeyx/work/benchmark/userbenchmark/optim/__init__.py", line 288, in run_model
    params = get_model_params(modelName, device)
  File "/scratch/janeyx/work/benchmark/userbenchmark/optim/__init__.py", line 240, in get_model_params
    params = _get_model_params(Model(device=device, test='train'))
  File "/scratch/janeyx/work/benchmark/torchbenchmark/util/model.py", line 20, in __call__
    obj = type.__call__(cls, *args, **kwargs)
  File "/scratch/janeyx/work/benchmark/torchbenchmark/models/torchrec_dlrm/__init__.py", line 46, in __init__
    dist.init_process_group(backend=backend)
  File "/fsx/users/janeyx/conda/envs/torchbenchmark/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 131, in wrapper
    return func(*args, **kwargs)
  File "/fsx/users/janeyx/conda/envs/torchbenchmark/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line 902, in init_process_group
    raise RuntimeError("trying to initialize the default process group " "twice!")
```
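The optim benchmark initializes the model twice in one process (once for CPU, once for CUDA), so the second initialization hits this error. Constructing the model a second time should not fail, so the torchrec_dlrm model now skips dist.init_process_group when the default process group is already initialized.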
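A minimal sketch of the guard, not the exact diff from this PR. It assumes the check is done with `torch.distributed.is_initialized()`. The helper name `init_process_group_once` is hypothetical, and the `dist` module is passed in as an argument only so the sketch can be exercised without a real distributed setup.

```python
def init_process_group_once(dist, backend):
    """Initialize the default process group unless one already exists.

    `dist` is expected to be the `torch.distributed` module (an assumption
    of this sketch; it is injected rather than imported here).
    Returns True if this call performed the initialization, False if the
    default process group was already initialized and the call was skipped.
    """
    if dist.is_initialized():
        # A previous Model(...) construction in this process already set up
        # the default process group; calling init_process_group again would
        # raise "trying to initialize the default process group twice!".
        return False
    dist.init_process_group(backend=backend)
    return True
```

In the model's `__init__`, this would presumably be called as `init_process_group_once(torch.distributed, backend)` in place of the unconditional `dist.init_process_group(backend=backend)` shown in the traceback.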
Pull Request resolved: https://github.com/pytorch/benchmark/pull/1499
Reviewed By: xuzhao9
Differential Revision: D44274202
Pulled By: janeyx99
fbshipit-source-id: c3c64396cc448fa6f514a29088b72e7b89ae973b