pytorch
d8d7af08 - Fix CUDA shared memory out of bound access in findPattern (#28989)

Commit
5 years ago
Fix CUDA shared memory out of bound access in findPattern (#28989)

Summary:
This fixes https://github.com/pytorch/pytorch/issues/28789

Only the first two elements of `smem` are used in this function, but at the beginning it resets all `C10_WARP_SIZE` elements to 0. When `scalar_t` is 64-bit, that reset writes past the end of the shared memory allocation, whose total size is `sizeof(int) * C10_WARP_SIZE`, although this does not lead to any failure in CI.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/28989

Differential Revision: D18271598

Pulled By: ngimel

fbshipit-source-id: 38cc863722509892646f719efb05e2730a7d9ae1
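The size mismatch described in the summary can be checked with simple arithmetic. The following is a minimal host-side C++ sketch, not the actual kernel code: `kWarpSize`, `allocated_bytes`, `touched_bytes`, and `overruns` are hypothetical names introduced here for illustration, and the warp size of 32 is an assumption (the value `C10_WARP_SIZE` is believed to take on NVIDIA GPUs).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed warp size for illustration only; stands in for C10_WARP_SIZE.
constexpr std::size_t kWarpSize = 32;

// Bytes available in the shared buffer, per the commit message:
// sizeof(int) * C10_WARP_SIZE.
constexpr std::size_t allocated_bytes() {
  return sizeof(int) * kWarpSize;
}

// Bytes written when `count` elements of type T are zeroed from offset 0.
template <typename T>
constexpr std::size_t touched_bytes(std::size_t count) {
  return sizeof(T) * count;
}

// True when zeroing `count` elements of type T runs past the allocation.
template <typename T>
constexpr bool overruns(std::size_t count) {
  return touched_bytes<T>(count) > allocated_bytes();
}

// Zeroing a full warp's worth of 64-bit elements exceeds the buffer...
static_assert(overruns<std::int64_t>(kWarpSize),
              "64-bit reset of kWarpSize elements overruns the buffer");
// ...while zeroing only the two elements that are actually used does not.
static_assert(!overruns<std::int64_t>(2),
              "resetting two 64-bit elements stays in bounds");
```

Under these assumptions, a 64-bit `scalar_t` needs twice the bytes that were allocated when every warp-sized slot is reset, whereas resetting only the two slots the function uses fits for any element type up to 64 bits.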