Allow PyTorch to be built without NCCL (#17295)
Summary:
With this patch you can build with USE_DISTRIBUTED=OFF, possibly in combination with USE_NCCL=OFF (whether the latter is needed is not confirmed).
This matters partly because NCCL does not build with CUDA 8.
This is written under the assumption that NCCL is required for distributed. If it is not, the USE_DISTRIBUTED check in nccl.py should be replaced by a check for the USE_NCCL environment variable.
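For illustration only, the alternative check on USE_NCCL might look roughly like the sketch below. The helper names (env_flag_is_off, should_build_nccl) and the accepted "off" spellings are assumptions made for this example and are not taken from nccl.py or any existing PyTorch helper.

```python
import os


def env_flag_is_off(name, environ=None):
    """Return True if the variable is explicitly set to an 'off' value.

    Hypothetical helper; the accepted spellings are an assumption.
    """
    environ = os.environ if environ is None else environ
    return environ.get(name, "").upper() in ("OFF", "0", "NO", "FALSE", "N")


def should_build_nccl(environ=None):
    """Gate the NCCL build on USE_NCCL instead of USE_DISTRIBUTED.

    Hypothetical function: NCCL is built unless USE_NCCL is explicitly off.
    """
    return not env_flag_is_off("USE_NCCL", environ)
```

With this variant, setting USE_DISTRIBUTED=OFF alone would leave NCCL enabled; only USE_NCCL=OFF would disable it.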
Fixes: #17274
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17295
Differential Revision: D14155080
Pulled By: ezyang
fbshipit-source-id: 0d133f7c5b4d118849f041bd4d4cbbd7ffc3c7b4