eb4d43df - Make CUDA triu / tril support batches of size > 65535 (#21067)

Summary: The previous implementation of triu / tril passed the batch size in the 2nd dimension of the CUDA grid. That dimension is limited to 65535, so performing triu / tril on a tensor with a batch size > 65535 threw an error. This PR removes the dependence on the 2nd grid dimension, along with the contiguity constraints it imposed.

Changelog:
- Compute the batch offset, row, and column inside the kernel
- Use only the 1st dimension of the grid
- Remove the contiguity checks on tensors made unnecessary by this change

Pull Request resolved: https://github.com/pytorch/pytorch/pull/21067
Differential Revision: D15572501
Pulled By: ezyang
fbshipit-source-id: 93851cb661918ce794d43eeb12c8a38762e1358c
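
Below is a minimal CUDA sketch of the indexing scheme the changelog describes. It is not the actual PyTorch kernel; the kernel name, signature, and launch parameters are illustrative. The idea is to launch a 1-D grid over all elements and recover (batch, row, col) from the linear index inside the kernel, rather than mapping the batch index to gridDim.y, which is capped at 65535.

```cuda
// Sketch: upper-triangular (triu) masking over a batch of matrices
// using only the 1st grid dimension. A grid-stride loop makes the
// kernel correct for any element count, including batches > 65535.
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

__global__ void triu_kernel_1d(float* out, const float* in,
                               int64_t rows, int64_t cols,
                               int64_t total_elems, int64_t diagonal) {
  int64_t linear = blockIdx.x * (int64_t)blockDim.x + threadIdx.x;
  for (; linear < total_elems; linear += (int64_t)gridDim.x * blockDim.x) {
    int64_t elems_per_batch = rows * cols;
    int64_t batch  = linear / elems_per_batch;  // which matrix in the batch
    int64_t within = linear % elems_per_batch;
    int64_t row    = within / cols;
    int64_t col    = within % cols;
    int64_t offset = batch * elems_per_batch;   // start of this matrix
    // triu keeps elements on or above the given diagonal.
    out[offset + within] =
        (col - row >= diagonal) ? in[offset + within] : 0.0f;
  }
}

int main() {
  const int64_t batch = 70000, rows = 4, cols = 4;  // batch > 65535 is fine
  const int64_t total = batch * rows * cols;
  float *in, *out;
  cudaMallocManaged(&in, total * sizeof(float));
  cudaMallocManaged(&out, total * sizeof(float));
  for (int64_t i = 0; i < total; ++i) in[i] = 1.0f;

  const int threads = 256;
  const int blocks = (int)((total + threads - 1) / threads);  // 1-D grid only
  triu_kernel_1d<<<blocks, threads>>>(out, in, rows, cols, total, 0);
  cudaDeviceSynchronize();

  // First row of the last matrix in the batch: all kept by triu(0).
  int64_t last = total - rows * cols;
  printf("%g %g %g %g\n", out[last], out[last + 1], out[last + 2], out[last + 3]);
  cudaFree(in);
  cudaFree(out);
  return 0;
}
```

Because the batch index is derived arithmetically from the linear index, no grid dimension ever needs to equal the batch size, which is what lifts the 65535 limit.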