[c10d] Add flag value for direct teardown without comm abort (#102599)
It was recently reported that `ncclCommAbort` itself may hang in some NCCL versions (see, for example, https://github.com/NVIDIA/nccl/issues/829).
In that case, it may be desirable to tear down the program directly, without properly aborting the NCCL communicator, so that the user does not wait for hours before noticing a hang.
This PR adds a new value, 3, for the environment variable `NCCL_ASYNC_ERROR_HANDLING`. With this value, the comm abort is skipped and an error is thrown directly when an exception occurs (timeout, async error, etc.).
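A minimal sketch of how a launcher script might opt in to the new value. The helper name `enable_direct_teardown` and the constant `SKIP_COMM_ABORT` are hypothetical and only for illustration. The sketch also assumes the variable must be set before the NCCL process group is created.

```python
import os

# Value added by this PR: skip the comm abort and throw the error directly.
# The constant name is hypothetical, chosen for this example only.
SKIP_COMM_ABORT = "3"


def enable_direct_teardown(env=os.environ):
    """Set NCCL_ASYNC_ERROR_HANDLING=3 in the given environment mapping."""
    env["NCCL_ASYNC_ERROR_HANDLING"] = SKIP_COMM_ABORT
    return env["NCCL_ASYNC_ERROR_HANDLING"]


if __name__ == "__main__":
    # Assumption: call this before the NCCL process group is initialized,
    # so that the setting is visible when the group is constructed.
    enable_direct_teardown()
    print(os.environ["NCCL_ASYNC_ERROR_HANDLING"])
```

The same effect can be had by exporting `NCCL_ASYNC_ERROR_HANDLING=3` in the shell that launches the job.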
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102599
Approved by: https://github.com/fegin