[NCCL] Support Wait Timeout in ProcessGroupNCCL (#40946)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40946
Adds a timeout to ProcessGroupNCCL::wait. WorkNCCL objects already have a timeout, which is set during ProcessGroupNCCL construction. The new wait function overrides that timeout with the user-defined timeout if one is provided. Timed-out operations result in the NCCL communicators being aborted and an exception being thrown.
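The override behavior described above can be modeled with a minimal Python sketch. It is not the C++ implementation from this PR. `FakeCommunicator`, `FakeWork`, `default_timeout_s`, `is_completed`, the use of `None` to mean "no user timeout", and the choice of `TimeoutError` are all hypothetical names and assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class FakeCommunicator:
    """Hypothetical stand-in for an NCCL communicator (not a real API)."""

    def __init__(self) -> None:
        self.aborted = False

    def abort(self) -> None:
        # Record that the communicator was torn down.
        self.aborted = True


class FakeWork:
    """Hypothetical model of a work object with a construction-time timeout."""

    def __init__(
        self,
        comm: FakeCommunicator,
        default_timeout_s: float,
        is_completed: Callable[[], bool],
    ) -> None:
        self.comm = comm
        self.default_timeout_s = default_timeout_s  # set at construction
        self.is_completed = is_completed

    def wait(self, timeout_s: Optional[float] = None) -> bool:
        # Assumption: None means the caller supplied no timeout, so the
        # construction-time value applies; otherwise the caller's value wins.
        effective = self.default_timeout_s if timeout_s is None else timeout_s
        deadline = time.monotonic() + effective
        while not self.is_completed():
            if time.monotonic() >= deadline:
                # On timeout: abort the communicator, then raise.
                self.comm.abort()
                raise TimeoutError(f"operation timed out after {effective} s")
            time.sleep(0.001)  # short poll interval
        return True
```

In this sketch, calling `wait(timeout_s=0.01)` on a work object whose default is 60 seconds times out after roughly 10 ms rather than 60 s, which is the override the summary describes. Calling `wait()` with no argument falls back to the construction-time timeout.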
ghstack-source-id: 107835739
Test Plan: Test added to `ProcessGroupNCCLTest` in the next PR in this stack.
Reviewed By: jiayisuse
Differential Revision: D22127898
fbshipit-source-id: 543964855ac5b41e464b2df4bb6c211ef053e73b