pytorch
86af14b0 - Resolves ptxas warnings when compiling for CUDA_ARCH 750 and a memoryType deprecation warning (#15461)


Commit

5 years ago

Resolves ptxas warnings when compiling for CUDA_ARCH 750 and a memoryType deprecation warning (#15461)

Summary:
When compiling for `TORCH_CUDA_ARCH_LIST=7.5` we were getting ptxas warnings (https://github.com/pytorch/pytorch/issues/14310). The cause was hardcoded values passed to launch_bounds in some kernels. The maximum number of threads per multiprocessor is 1024 for the Turing architecture (7.5) but 2048 for previous architectures. The hardcoded launch_bounds in the kernels requested 2048 threads when compiling for Turing, which generated the warning.

This PR adds a macro that bounds-checks the launch bounds value supplied. The maximum number of threads per block across all architectures is 1024. If a user supplies more than 1024, I just clamp it down to 512. Depending on this value, I set the minimum number of blocks per SM.

This PR should resolve https://github.com/pytorch/pytorch/issues/14310. The incorrect gradient computation reported in that issue is probably due to the faulty card.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/15461
Differential Revision: D13633952
Pulled By: soumith
fbshipit-source-id: 795aa151109f343ab5433bf3cb070cb6ec896fff
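The clamping described in the summary can be sketched roughly as follows. This is an illustration, not the code from the PR: every name here (`kMaxThreadsPerBlock`, `kFallbackThreadsPerBlock`, `kMaxThreadsPerSM`, `SKETCH_MAX_THREADS_PER_BLOCK`, `SKETCH_MIN_BLOCKS_PER_SM`, `scale_kernel`) is hypothetical, and the rule for reducing the blocks-per-SM value (integer division of the per-SM limit by the thread count) is an assumption, since the commit message only says the value depends on the clamped thread count.

```cpp
// Hypothetical constants; the numeric values are the ones quoted in the commit message.
constexpr int kMaxThreadsPerBlock = 1024;      // max threads per block on all architectures
constexpr int kFallbackThreadsPerBlock = 512;  // clamp target when the request exceeds 1024

// Threads per multiprocessor: 1024 on Turing (7.5), 2048 on earlier architectures.
// __CUDA_ARCH__ is only defined in nvcc's device compilation pass.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ == 750
constexpr int kMaxThreadsPerSM = 1024;
#else
constexpr int kMaxThreadsPerSM = 2048;
#endif

// Keep the requested block size if it is legal, otherwise fall back to 512.
#define SKETCH_MAX_THREADS_PER_BLOCK(val) \
  (((val) <= kMaxThreadsPerBlock) ? (val) : kFallbackThreadsPerBlock)

// Keep the requested blocks-per-SM if threads * blocks fits on one SM;
// otherwise use as many blocks as fit (assumed rule, see lead-in).
// `threads` is expected to be an already clamped value.
#define SKETCH_MIN_BLOCKS_PER_SM(threads, blocks)           \
  ((((threads) * (blocks)) <= kMaxThreadsPerSM) ? (blocks)  \
                                                : (kMaxThreadsPerSM / (threads)))

// These checks do not depend on the target architecture.
static_assert(SKETCH_MAX_THREADS_PER_BLOCK(256) == 256, "legal values pass through");
static_assert(SKETCH_MAX_THREADS_PER_BLOCK(2048) == 512, "oversized values are clamped");

#ifdef __CUDACC__
// Example kernel: asks for 1024 threads and 2 blocks per SM. On Turing the
// second argument evaluates to 1, so ptxas is never asked for 2048 threads.
__global__ void
__launch_bounds__(SKETCH_MAX_THREADS_PER_BLOCK(1024),
                  SKETCH_MIN_BLOCKS_PER_SM(SKETCH_MAX_THREADS_PER_BLOCK(1024), 2))
scale_kernel(float* data, float factor, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    data[i] *= factor;
  }
}
#endif
```

Because both macros expand to integer constant expressions, they can be used directly as `__launch_bounds__` arguments, and the same kernel source compiles cleanly for Turing and for older architectures.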

Author

syed-ahmed

Committer

facebook-github-bot

Parents

07ea3e03
