f371b526 - Made max_pool2d_with_indices_backward_cuda contiguify `indices` (#85493)

Currently, max_pool2d_with_indices_backward(grad_output, self, ..., indices) (on CUDA) assumes that `indices` has the same suggested memory format as `self`. This is indeed always true in regular PyTorch: the max_pool2d_with_indices forward pass returns `indices` with the same suggested memory format as `self`. However, we'd like to argue that always contiguifying `indices` is good for consistency, has negligible added cost, and is more robust (for Tensor Subclass authors):

- The max_pool2d_with_indices_backward implementation for CPU already contiguifies `indices`, as does the max_pool3d_with_indices_backward implementation.
- Calling .contiguous() has almost no added cost (compared to before) because there is a fast path that checks the cached memory_format on the TensorImpl (illustrated in the sketch after this message).
- functorch has trouble writing a batching rule for `max_pool2d_with_indices_backward`. Having it accept `indices` with arbitrary strides means vmap does not need to special-case the batching rule for the strides of `indices`.

Test Plan:
- Not sure if it's worth writing a separate test case; this PR fixes one of functorch's test cases.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85493
Approved by: https://github.com/ezyang
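A minimal sketch (not part of the commit) of the two facts the argument above relies on: the forward pass hands back `indices` in the input's suggested memory format, and `.contiguous()` is a cheap no-op when the layout already matches. It assumes a recent PyTorch build and runs on CPU for convenience, even though the commit targets the CUDA backward kernel.

```python
import torch
import torch.nn.functional as F

# Input in channels_last: the layout whose mismatch the old CUDA backward
# kernel could not handle when `indices` arrived with different strides.
x = torch.randn(2, 3, 8, 8).to(memory_format=torch.channels_last)

# Forward pass: `indices` comes back with the same suggested memory format
# as the input, which is why the old kernel's assumption held in regular
# PyTorch.
out, indices = F.max_pool2d(x, kernel_size=2, return_indices=True)
assert indices.is_contiguous(memory_format=torch.channels_last)

# Contiguifying a tensor that already has the requested layout is nearly
# free: the cached memory-format flags on the TensorImpl short-circuit the
# call and the very same tensor object is returned.
assert indices.contiguous(memory_format=torch.channels_last) is indices

# Only a tensor whose strides actually differ pays for a real copy, e.g.
# repacking channels_last indices into the default contiguous layout.
repacked = indices.contiguous()
assert repacked is not indices
```

Under these assumptions, the unconditional contiguify added by this commit only ever copies for the unusual callers (Tensor Subclasses, vmap-produced strides) it is meant to support.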