Made max_pool2d_with_indices_backward_cuda contiguify `indices` (#85493)
Currently, max_pool2d_with_indices_backward(grad_output, self, ..., indices)
(on CUDA) assumes that `indices` has the same suggested memory format as
`self`.
This is indeed always true in regular PyTorch: the max_pool2d_with_indices
forward pass returns `indices` with the same suggested memory format as
`self`.
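A minimal sketch of that invariant (assuming a CUDA build; the shapes here
are arbitrary):

```python
import torch
import torch.nn.functional as F

# Sketch of the invariant described above: the forward pass returns
# `indices` in the same suggested memory format as the input.
x = torch.randn(2, 3, 8, 8, device="cuda").to(memory_format=torch.channels_last)
out, indices = F.max_pool2d(x, kernel_size=2, return_indices=True)
assert indices.is_contiguous(memory_format=torch.channels_last)
```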
However, we'd like to argue that always contiguifying `indices` is good
for consistency, has negligible added cost, and is more robust (for
Tensor Subclass authors):
- The CPU implementation of max_pool2d_with_indices_backward already
contiguifies `indices`, as does the max_pool3d_with_indices_backward
implementation.
- Calling .contiguous() adds almost no cost over the previous behavior,
because a fast path checks the cached memory_format on the TensorImpl
and skips the copy when `indices` is already contiguous (see the sketch
after this list).
- functorch has trouble writing a batching rule for
`max_pool2d_with_indices_backward`. Having it accept `indices` with
arbitrary strides means that vmap does not need to special-case the
strides of `indices` in the batching rule.
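A quick illustration of the fast path mentioned above (a sketch of the
observable behavior, not the ATen internals):

```python
import torch

# When a tensor is already contiguous in the requested memory format,
# .contiguous() returns the tensor itself without copying, so
# unconditionally contiguifying `indices` is cheap in the common case.
indices = torch.randint(0, 16, (2, 3, 4, 4))
assert indices.contiguous() is indices  # fast path: no copy is made
```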
Test Plan:
- Not sure if it's worth writing a separate test case; this PR fixes one
of functorch's test cases.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85493
Approved by: https://github.com/ezyang