[NCCL] Tests for WorkNCCL::wait with Timeouts (#40947)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40947
This PR adds tests for work-level timeouts in WorkNCCL objects. The test kicks off an allgather operation that is delayed by 1000ms before it actually starts computation, then waits on completion of that op with a timeout of 250ms, expecting the wait to time out and throw a runtime error.
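For illustration only, the delay-then-wait pattern described above can be sketched without NCCL or GPUs. Nothing below is the real c10d API: `DelayedWork` and `waitTimesOut` are hypothetical names, and a `std::future` stands in for the delayed allgather. It is a minimal sketch of the test's shape, not the test added in this PR.

```cpp
#include <chrono>
#include <future>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in for a work handle; NOT the real WorkNCCL class.
// The "operation" is just a background task that sleeps for `delay`.
class DelayedWork {
 public:
  explicit DelayedWork(std::chrono::milliseconds delay)
      : done_(std::async(std::launch::async, [delay] {
          std::this_thread::sleep_for(delay);
        })) {}

  // Blocks for at most `timeout`; throws std::runtime_error if the
  // operation has not completed by then.
  void wait(std::chrono::milliseconds timeout) {
    if (done_.wait_for(timeout) != std::future_status::ready) {
      throw std::runtime_error("operation timed out");
    }
  }

 private:
  std::future<void> done_;
};

// Returns true if waiting on a delayed operation throws a runtime error.
// Note: the std::async future joins in its destructor, so this call
// returns only after the full delay has elapsed.
bool waitTimesOut(std::chrono::milliseconds delay,
                  std::chrono::milliseconds timeout) {
  DelayedWork work(delay);
  try {
    work.wait(timeout);
  } catch (const std::runtime_error&) {
    return true;
  }
  return false;
}
```

With the values from this PR, `waitTimesOut(std::chrono::milliseconds(1000), std::chrono::milliseconds(250))` is expected to return true, since the 250ms wait expires well before the 1000ms delay ends.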
ghstack-source-id: 107835734
Test Plan: This diff adds tests; check CI/Sandcastle for correctness. These are NCCL tests, so they require at least 2 GPUs to run.
Reviewed By: jiayisuse
Differential Revision: D22173101
fbshipit-source-id: 8595e4b67662cef781b20ced0befdcc53d157c39