pytorch
cd207737 - Set CUDA arch correctly when building with torch.utils.cpp_extension (#23408)

Commit

5 years ago

Set CUDA arch correctly when building with torch.utils.cpp_extension (#23408)

Summary: The old behavior was to always use `sm_30`. The new behavior is:

- For building via a setup.py, check if `'arch'` is in `extra_compile_args`. If so, don't change anything.
- If `TORCH_CUDA_ARCH_LIST` is set, respect that (can be 1 or more arches)
- Otherwise, query device capability and use that.

To test this, for example on a machine with `torch` installed for py37:

```
$ git clone https://github.com/pytorch/extension-cpp.git
$ cd extension-cpp/cuda
$ python setup.py install
$ cuobjdump --list-elf build/lib.linux-x86_64-3.7/lltm_cuda.cpython-37m-x86_64-linux-gnu.so

ELF file 1: lltm.1.sm_61.cubin
```

Existing tests in `test_cpp_extension.py` for `load_inline` and for compiling via `setup.py` in test/cpp_extensions/ cover this.

Closes gh-18657

EDIT: some more tests:

```
from torch.utils.cpp_extension import load

lltm = load(name='lltm', sources=['lltm_cuda.cpp', 'lltm_cuda_kernel.cu'])
```

```
# with TORCH_CUDA_ARCH_LIST undefined or an empty string
$ cuobjdump --list-elf /tmp/torch_extensions/lltm/lltm.so

ELF file 1: lltm.1.sm_61.cubin

# with TORCH_CUDA_ARCH_LIST = "3.5 5.2 6.0 6.1 7.0+PTX"
$ cuobjdump --list-elf build/lib.linux-x86_64-3.7/lltm_cuda.cpython-37m-x86_64-linux-gnu.so

ELF file 1: lltm_cuda.cpython-37m-x86_64-linux-gnu.1.sm_35.cubin
ELF file 2: lltm_cuda.cpython-37m-x86_64-linux-gnu.2.sm_52.cubin
ELF file 3: lltm_cuda.cpython-37m-x86_64-linux-gnu.3.sm_60.cubin
ELF file 4: lltm_cuda.cpython-37m-x86_64-linux-gnu.4.sm_61.cubin
ELF file 5: lltm_cuda.cpython-37m-x86_64-linux-gnu.5.sm_70.cubin
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23408

Differential Revision: D16784110

Pulled By: soumith

fbshipit-source-id: 69ba09e235e4f906b959fd20322c69303240ee7e
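
The precedence described in the summary can be sketched roughly as follows. This is an illustrative sketch only, not the code added to `torch.utils.cpp_extension` by this commit: the function name `select_cuda_arch_flags` and its parameters are hypothetical, and the handling of the `+PTX` suffix (emitting an extra `code=compute_XX` target) is an assumption about how such an entry would typically map to `nvcc` flags.

```python
def select_cuda_arch_flags(cflags, arch_list_env, device_capability):
    """Hypothetical helper: pick nvcc -gencode flags in the order the summary lists.

    cflags            -- user-supplied compile args (e.g. from extra_compile_args)
    arch_list_env     -- value of TORCH_CUDA_ARCH_LIST, or None/"" if unset
    device_capability -- (major, minor) tuple for the visible GPU
    """
    # 1. The user already chose an architecture: don't change anything.
    if any("arch" in flag for flag in cflags):
        return []

    # 2. TORCH_CUDA_ARCH_LIST is set: respect it (one or more arches).
    #    An empty string is treated the same as an undefined variable.
    if arch_list_env:
        arches = arch_list_env.split()
    else:
        # 3. Otherwise fall back to the capability of the queried device.
        major, minor = device_capability
        arches = [f"{major}.{minor}"]

    flags = []
    for arch in arches:
        want_ptx = arch.endswith("+PTX")
        version = arch[: -len("+PTX")] if want_ptx else arch
        num = version.replace(".", "")  # "6.1" -> "61"
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
        if want_ptx:
            # Assumption: "+PTX" additionally embeds PTX for forward compatibility.
            flags.append(f"-gencode=arch=compute_{num},code=compute_{num}")
    return flags
```

In real use the last two arguments would presumably come from `os.environ.get("TORCH_CUDA_ARCH_LIST")` and `torch.cuda.get_device_capability()`; they are passed in explicitly here so the sketch runs without a GPU or a `torch` install.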

Author

rgommers

Committer

facebook-github-bot

Parents

02dd9a40
