Make broadcast_object_list accept a device parameter. (#61305)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61305
Part I (this PR): Add dist_device argument to broadcast_object_list API
Part II: andwgu@ will deprecate _broadcast_object in favor of the newly introduced API
Also includes the changes to _object_to_tensor()/_tensor_to_object() from PR 60573
Context: https://github.com/pytorch/pytorch/issues/60062
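For reviewers unfamiliar with the object/tensor round trip touched here, the following is a minimal stdlib-only sketch of the idea, not the code in this PR: each object is pickled into a byte buffer, the per-object sizes travel alongside the concatenated payload, and the receiver slices and unpickles. The helper names object_to_buffer and buffer_to_object are hypothetical, and a bytearray stands in for the byte tensor that the real helpers would place on the requested device.

```python
import io
import pickle


def object_to_buffer(obj):
    """Pickle obj into a byte buffer; return (buffer, size).

    Hypothetical stand-in for the object-to-tensor step: the real helper
    would produce a byte tensor and a size tensor, not a bytearray.
    """
    stream = io.BytesIO()
    pickle.dump(obj, stream)
    buf = bytearray(stream.getvalue())
    return buf, len(buf)


def buffer_to_object(buf, size):
    """Unpickle the first `size` bytes of buf (hypothetical inverse helper)."""
    return pickle.loads(bytes(buf[:size]))


# Sender side: serialize every object, then build one flat payload.
objects = ["foo", 12, {"k": [1, 2, 3]}]
pairs = [object_to_buffer(o) for o in objects]
sizes = [size for _, size in pairs]
payload = bytearray(b"".join(buf for buf, _ in pairs))

# Receiver side: given the sizes and the payload, slice and deserialize.
received = []
offset = 0
for size in sizes:
    received.append(buffer_to_object(payload[offset:offset + size], size))
    offset += size
```

In this sketch the sizes must reach the receiver before the payload so it knows where to slice; where the buffers live (CPU or a CUDA device) is the part the new argument is meant to control.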
Test Plan:
Run the following on a DevGPU with two CUDA devices:
$ python setup.py develop  # run this build on the DevGPU
$ BACKEND='nccl' WORLD_SIZE=2 with-proxy python test/distributed/test_distributed_fork.py TestDistBackendWithFork.test_broadcast_object_list --v
$ BACKEND='gloo' WORLD_SIZE=2 with-proxy python test/distributed/test_distributed_fork.py TestDistBackendWithFork.test_broadcast_object_list --v
Build with distributed on: USE_DISTRIBUTED=1 python setup.py develop
Test on CPU devvm:
$ with-proxy python test/distributed/optim/test_zero_redundancy_optimizer.py
Imported from OSS
Differential Revision:
D29566538
Reviewed By: iramazanli, mrshenli
Pulled By: bowangbj
fbshipit-source-id: 0bea52442551c5194acba85eadda16ba2ec4b6ef