Skip ProcessGroupNCCLTest if CUDA is not available (#28393)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28393
We should skip this test when CUDA is not available and tell the user why it was skipped.
Previously, if this test was run on CPU, it would fail with:
```
terminate called after throwing an instance of 'std::runtime_error'
what(): cuda runtime error (3) : This binary is linked with CUDA lazy stubs and underlying .so files were not loaded. CUDA functionality is disabled. Set env variable CUDA_LAZY_DEBUG to get messages during startup
```
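The guard described above can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: `runTestsIfCudaAvailable` is a hypothetical helper, and the `cudaAvailable` flag is assumed to come from whatever runtime check the test binary uses.

```cpp
#include <cstdlib>
#include <functional>
#include <iostream>

// Hypothetical helper (not taken from the PR): runs `tests` only when the
// caller reports that CUDA is usable. Otherwise it prints the skip notice
// and returns success, so a CPU-only run does not abort.
int runTestsIfCudaAvailable(
    bool cudaAvailable,
    const std::function<void()>& tests) {
  if (!cudaAvailable) {
    std::cout << "CUDA not available, skipping test" << std::endl;
    return EXIT_SUCCESS;
  }
  tests();
  return EXIT_SUCCESS;
}
```

In a real test `main`, the flag would presumably be obtained from a runtime query such as `torch::cuda::is_available()` before any NCCL or CUDA call is made. Returning success in the skip path is an assumption here; it keeps CPU-only runs green while the printed message makes the skip visible.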
Test Plan:
Build on CPU and verify that there are no errors when running the test. We should get the message
`CUDA not available, skipping test`. Previously, we would get an error:
```
terminate called after throwing an instance of 'std::runtime_error'
what(): cuda runtime error (3) : This binary is linked with CUDA lazy stubs and underlying .so files were not loaded. CUDA functionality is disabled. Set env variable CUDA_LAZY_DEBUG to get messages during startup. at caffe2/aten/src/THC/THCGeneral.cpp:54
```
Differential Revision: D18054369
fbshipit-source-id: f1d06af88b780a24ca3373a7a133047a2cfe366e