pytorch
515238e0 - Unify cudaGetDeviceCount implementations. (#18445)

Commit

5 years ago

Unify cudaGetDeviceCount implementations. (#18445)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18445
ghimport-source-id: 30d018737bf6989bc68b7e3676f44e0ca6141fde

Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18242 Test running a CUDA build on CPU machine.
* **#18445 Unify cudaGetDeviceCount implementations.**

I went about doing this by searching for calls to cudaGetDeviceCount, and then methodically replacing them with references to c10::cuda::device_count() or at::cuda::device_count().

There is a point to doing this: the various implementations wildly differed in their handling of what to do when cudaGetDeviceCount returns an error. The final standardized behavior is that **all errors are swallowed** and we return device count of zero. This indirectly fixes running CUDA builds on CPU, which was broken in #17847.

I added 'noexcept' to the 'deviceCount' virtual method on DeviceGuardImpl. This is a BC-breaking change for anyone inheriting from DeviceGuardImpl but all you need to do is put 'noexcept' on your method and it is backwards compatible with older libtorch.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14612189
fbshipit-source-id: 3c8d186e3dd623c0e27625212c7ce30f75d943cb
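As a rough illustration of the two changes described in the message (swallowing every error into a count of zero, and marking the `deviceCount` override `noexcept`), here is a minimal sketch that builds without CUDA. It is not the code from this commit: `CountQuery`, `device_count_or_zero`, `GuardImplSketch`, and `MyGuardImpl` are hypothetical names, the driver call is replaced by a caller-supplied callback, and the return type is simplified to `int`.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-in for a driver query such as cudaGetDeviceCount:
// writes the count through the pointer and returns 0 on success,
// or a nonzero status on failure.
using CountQuery = std::function<int(int*)>;

// Sketch of the standardized behavior: any failure (error status,
// negative count, or a thrown exception) is reported as zero devices.
inline int device_count_or_zero(const CountQuery& query) noexcept {
  try {
    int count = 0;
    const int status = query(&count);
    if (status != 0 || count < 0) {
      return 0;  // swallow the error
    }
    return count;
  } catch (...) {
    return 0;  // also covers an empty std::function
  }
}

// Hypothetical interface mirroring the shape of a guard implementation
// whose deviceCount() is declared noexcept.
struct GuardImplSketch {
  virtual ~GuardImplSketch() = default;
  virtual int deviceCount() const noexcept = 0;
};

// A subclass must repeat noexcept on its override; an override with a
// looser exception specification than the base method does not compile.
struct MyGuardImpl final : GuardImplSketch {
  explicit MyGuardImpl(CountQuery query) : query_(std::move(query)) {}

  int deviceCount() const noexcept override {
    return device_count_or_zero(query_);
  }

 private:
  CountQuery query_;
};
```

Under these assumptions, a caller on a machine without a working driver sees a count of zero rather than an exception, which is why a CUDA build can still start on a CPU-only host.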

Author

ezyang

Committer

facebook-github-bot

Parents

cf094d4e
