pytorch
c7183c98 - Fix object-based collectives API to use torch.cuda.current_device instead of (#46897)

Commit

4 years ago

Fix object-based collectives API to use torch.cuda.current_device instead of (#46897)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46897

These APIs implicitly assumed that gpu for rank == rank index, but that is not necessarily true. For example, the first GPU could be used for a different purpose and rank 0 could use GPU 1, rank 1 uses GPU 2, etc. Thus, we mandate that the user specify the device to use via `torch.cuda.set_device()` before making calls to this API. This expectation should be okay since we clearly document it, and we expect the user to set this for DistributedDataParallel as well.

Also adds/tidies up some documentation.

ghstack-source-id: 115359633

Test Plan: Modified unittests

Reviewed By: divchenko

Differential Revision: D24556177

fbshipit-source-id: 7e826007241eba0fde3019180066ed56faf3c0ca
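The requirement described in the summary can be illustrated with a minimal sketch. It is not code from this commit. The helper `device_index_for_rank`, its `first_gpu` offset, and the wrapper `gather_objects` are hypothetical names chosen for illustration, and the sketch assumes a process group has already been initialized with a GPU-capable backend. It mirrors the example in the message, where GPU 0 is reserved and rank 0 uses GPU 1, rank 1 uses GPU 2, and so on.

```python
def device_index_for_rank(local_rank: int, first_gpu: int = 1) -> int:
    """Hypothetical mapping: GPUs below `first_gpu` are reserved for other
    work, so local rank r is placed on GPU r + first_gpu."""
    if local_rank < 0 or first_gpu < 0:
        raise ValueError("local_rank and first_gpu must be non-negative")
    return local_rank + first_gpu


def gather_objects(rank: int, world_size: int, obj):
    """Gather one picklable object from every rank.

    Assumes torch.distributed.init_process_group(...) was already called.
    """
    # Imported lazily so the mapping helper above works without PyTorch.
    import torch
    import torch.distributed as dist

    # Select this process's GPU explicitly before calling the object-based
    # collective, rather than relying on rank == GPU index.
    torch.cuda.set_device(device_index_for_rank(rank))

    gathered = [None] * world_size
    dist.all_gather_object(gathered, obj)
    return gathered
```

Under these assumptions each process would call `gather_objects(rank, world_size, obj)` after initializing its process group. The same `torch.cuda.set_device()` call is what a user would typically already make when setting up DistributedDataParallel.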

Author

rohan-varma

Committer

facebook-github-bot

Parents

dc817635
