Fix NCCL version check when nccl.h is in a non-standard location. (#40982)
Summary:
The NCCL discovery process fails to compile detect_nccl_version.cc when nccl.h resides in a non-standard location.
Pass `NCCL_INCLUDE_DIRS` to the `try_run(... detect_nccl_version.cc)` call to fix this.
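A rough sketch of what the change amounts to, assuming the check uses CMake's `try_run` and that the include path is forwarded through `CMAKE_FLAGS` as `-DINCLUDE_DIRECTORIES`. The result variable names, the source file path, and `NCCL_LIBRARIES` are illustrative assumptions and may differ from the actual FindNCCL.cmake:
```cmake
# Sketch only: every name except NCCL_INCLUDE_DIRS is an assumption.
# Without the CMAKE_FLAGS line, the test program can only find <nccl.h> when
# it is on the compiler's default include path, so the compile step fails
# and the version check reports a mismatch.
try_run(NCCL_VERSION_MATCHED NCCL_VERSION_COMPILED
        ${PROJECT_BINARY_DIR}
        ${PROJECT_BINARY_DIR}/detect_nccl_version.cc
        CMAKE_FLAGS "-DINCLUDE_DIRECTORIES=${NCCL_INCLUDE_DIRS}"
        LINK_LIBRARIES ${NCCL_LIBRARIES}
        RUN_OUTPUT_VARIABLE NCCL_VERSION_FROM_HEADER)
```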
The failure can be reproduced with the following Dockerfile:
```Dockerfile
FROM nvidia/cuda:10.2-cudnn7-devel-ubuntu18.04 as build
WORKDIR /stage
# install conda
ARG CONDA_VERSION=4.7.10
ARG CONDA_URL=https://repo.anaconda.com/miniconda/Miniconda3-${CONDA_VERSION}-Linux-x86_64.sh
RUN cd /stage && curl -fSsL --insecure ${CONDA_URL} -o install-conda.sh &&\
/bin/bash ./install-conda.sh -b -p /opt/conda &&\
/opt/conda/bin/conda clean -ya
ENV PATH=/opt/conda/bin:${PATH}
# install prerequisites
RUN conda install numpy ninja pyyaml mkl mkl-include setuptools cmake cffi
# attempt compile
ENV CUDA_HOME="/usr/local/cuda" \
CUDNN_LIBRARY="/usr/lib/x86_64-linux-gnu" \
NCCL_INCLUDE_DIR="/usr/local/cuda/include" \
NCCL_LIB_DIR="/usr/local/cuda/lib64" \
USE_SYSTEM_NCCL=1
RUN apt-get -y update &&\
apt-get -y install git &&\
cd /stage && git clone https://github.com/pytorch/pytorch.git &&\
cd pytorch &&\
git submodule update --init --recursive &&\
python setup.py bdist_wheel
```
This generates the following error:
```
-- Found NCCL: /usr/local/cuda/include
-- Determining NCCL version from /usr/local/cuda/include/nccl.h...
-- Looking for NCCL_VERSION_CODE
-- Looking for NCCL_VERSION_CODE - found
CMake Error at cmake/Modules/FindNCCL.cmake:78 (message):
Found NCCL header version and library version do not match! (include:
/usr/local/cuda/include, library: /usr/local/cuda/lib64/libnccl.so) Please
set NCCL_INCLUDE_DIR and NCCL_LIB_DIR manually.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40982
Reviewed By: zou3519
Differential Revision: D22603911
Pulled By: malfet
fbshipit-source-id: 084870375a270fb9c7daf3c2e731992a03614ad6