Optimize size(dim) and stride(dim)

Commit

2 years ago

Optimize size(dim) and stride(dim)

This improves `c10::maybe_wrap_dim` to short-cut the "happy path" where dim is in the correct range, and also moves the error and scalar edge-cases out-of-line. These changes cut callgrind instruction counts for `size(i)` from 5200 to 2000.

In the `size` and `stride` methods themselves, I also avoid calling `TensorImpl::dim()` since it may be a virtual call. This further reduced the instruction count from 2000 to 1500.

For comparison, `tensor.sizes()[0]` takes 1200 instructions so `tensor.size(0)` is still marginally slower. This is unavoidable though since it has to handle dimension wrapping.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75416
Approved by: https://github.com/Lezcano, https://github.com/ngimel
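The fast-path/slow-path split described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `sketch` namespace, `wrap_dim`, `wrap_dim_slow`, `SKETCH_NOINLINE`, and `ToyTensor` are hypothetical names, and the exception type and messages are assumptions. The inline function only handles an in-range `dim`; everything else (zero-dimensional tensors and out-of-range errors) goes to a function kept out of line so the hot path stays small.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper macro: ask the compiler not to inline the slow path.
#if defined(_MSC_VER)
#define SKETCH_NOINLINE __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE
#endif

namespace sketch {

// Slow path: scalar (0-dim) handling and error reporting live here,
// so none of this code is emitted at each call site.
SKETCH_NOINLINE int64_t wrap_dim_slow(int64_t dim, int64_t ndim, bool wrap_scalar) {
  if (ndim == 0 && wrap_scalar) {
    // Assumption: a scalar is treated like a 1-d tensor, so -1 and 0 both map to 0.
    if (dim == 0 || dim == -1) {
      return 0;
    }
  }
  throw std::out_of_range(
      "dimension " + std::to_string(dim) + " is out of range for a tensor with " +
      std::to_string(ndim) + " dimension(s)");
}

// Fast path: a single range check, then at most one addition.
inline int64_t wrap_dim(int64_t dim, int64_t ndim, bool wrap_scalar = true) {
  if (-ndim <= dim && dim < ndim) {
    return dim < 0 ? dim + ndim : dim;
  }
  return wrap_dim_slow(dim, ndim, wrap_scalar);
}

// Toy stand-in for a tensor: size() reads the rank from the stored sizes
// directly instead of going through a (possibly virtual) dim() call.
struct ToyTensor {
  std::vector<int64_t> sizes_;

  int64_t size(int64_t d) const {
    const int64_t ndim = static_cast<int64_t>(sizes_.size());
    return sizes_[static_cast<std::size_t>(wrap_dim(d, ndim, /*wrap_scalar=*/false))];
  }
};

}  // namespace sketch
```

With a `ToyTensor` holding sizes `{2, 3, 5}`, `size(0)` and `size(-3)` both take the fast path and return 2, while `size(3)` falls through to `wrap_dim_slow` and throws. Note that when `ndim` is 0 the range check can never succeed, so scalars always reach the slow path, which matches the idea of keeping that edge case out of line.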

Author

peterbell10

Committer

pytorchmergebot

Parents

03dd22a2

pytorch e6842c0e - Optimize size(dim) and stride(dim)