[NCCL] Explicitly Abort NCCL Communicators on Process Group Destruction (#40241)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40241
We abort incomplete NCCL communicators in the ProcessGroupNCCL
destructor; otherwise, pending NCCL communicators may block other CUDA ops.
Closes: https://github.com/pytorch/pytorch/issues/32231
ghstack-source-id: 106469423
Test Plan: CI/Sandcastle
Reviewed By: jiayisuse
Differential Revision: D22103662
fbshipit-source-id: 1f6f88b56bd7a5e9ca5a41698995a76e60e8ad9f