Fix flaky test_nccl_timeout. (#32653)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32653
This test was flaky because the watchdog thread, rather than the thread
calling `wait()`, could abort the communicator first. As a result, the user
could see `NCCL error` instead of the expected `Operation timed out`.
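
A toy sketch of the race described above, for illustration only. It does not
use PyTorch or NCCL: `FakeCommunicator` and `error_seen_by_user` are
hypothetical names, and the mapping from "who aborted first" to the error
message is an assumption taken from the description in this summary, not from
the actual ProcessGroupNCCL code.

```python
class FakeCommunicator:
    """Hypothetical stand-in for a communicator; not a real PyTorch class."""

    def __init__(self):
        self.aborted_by = None

    def abort(self, who):
        # Only the first abort takes effect; later calls are no-ops.
        if self.aborted_by is None:
            self.aborted_by = who


def error_seen_by_user(watchdog_first):
    """Return the message the caller of wait() would observe in this toy model."""
    comm = FakeCommunicator()
    # The two threads race; here the winner is chosen explicitly so the
    # outcome is deterministic.
    order = ["watchdog", "wait"] if watchdog_first else ["wait", "watchdog"]
    for who in order:
        comm.abort(who)
    # Assumption: the timeout message is reported only when the wait()
    # thread itself performed the abort.
    if comm.aborted_by == "wait":
        return "Operation timed out"
    return "NCCL error"


if __name__ == "__main__":
    print(error_seen_by_user(watchdog_first=False))
    print(error_seen_by_user(watchdog_first=True))
```

In this model the test's expected message appears only when the `wait()`
thread wins the race, which is why an assertion on `Operation timed out`
alone would fail intermittently.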
ghstack-source-id: 97250714
Test Plan: waitforbuildbot
Differential Revision: D19583003
fbshipit-source-id: 5c07326d1a16f214dcdbabed97ca613e0a5b42b9