pytorch
0d47374c - construct only necessary elements in OffsetCalculator (#55107)

Commit
4 years ago
construct only necessary elements in OffsetCalculator (#55107)

Summary:
Per title. Elements beyond `dim` are never accessed because https://github.com/pytorch/pytorch/blob/646510f7028f12e8b1f3a9d3b63b8519ed80e391/aten/src/ATen/cuda/detail/OffsetCalculator.cuh#L49-L51.

On `addmm` instruction count per 30 repetitions 1467813 -> 1452261
`add` 651522 -> 633462
`add_` 529331 -> 511271

add benchmarking snippet:
```
timer = Timer("m1.add_(b);", setup="at::Tensor m1=torch::empty({2,2},device(at::kCUDA) ); at::Tensor b = torch::empty({2}, device(at::kCUDA));", language="c++", timer=timeit.default_timer)
stats=timer.collect_callgrind(number=30)
print(stats.as_standardized().stats(inclusive=False))
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55107

Reviewed By: swolchok

Differential Revision: D27494492

Pulled By: ngimel

fbshipit-source-id: 23389a6bc9c9c0096751b95e7f9bf1c9f7bc594f
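The idea in the summary can be illustrated with a minimal host-side sketch. This is not the actual `OffsetCalculator` code: `SketchOffsetCalculator` and `kMaxDims` are hypothetical names, the value of `kMaxDims` is an assumption, and plain `%` and `/` stand in for whatever division helper the real class uses. The sketch only shows why filling the first `dims` entries is sufficient when the lookup loop exits as soon as it reaches `dims`.

```cpp
#include <array>
#include <cstdint>

// Hypothetical upper bound on tensor dimensions (assumed value, for illustration only).
constexpr int kMaxDims = 25;

// Simplified, host-only stand-in for an offset calculator.
// Dimension 0 is treated as the fastest-changing dimension.
struct SketchOffsetCalculator {
  int dims;
  // Storage is sized for kMaxDims, but entries at index >= dims are
  // intentionally left unconstructed because get() never reads them.
  std::array<int64_t, kMaxDims> sizes;
  std::array<int64_t, kMaxDims> strides;

  SketchOffsetCalculator(int dims_, const int64_t* sizes_, const int64_t* strides_)
      : dims(dims_) {
    // Fill only the first `dims` entries instead of all kMaxDims entries.
    for (int i = 0; i < dims; ++i) {
      sizes[i] = sizes_[i];
      strides[i] = strides_[i];
    }
  }

  // Convert a linear element index into a memory offset (in elements).
  int64_t get(int64_t linear_idx) const {
    int64_t offset = 0;
    for (int d = 0; d < kMaxDims; ++d) {
      if (d == dims) {
        break;  // early exit: entries beyond `dims` are never accessed
      }
      offset += (linear_idx % sizes[d]) * strides[d];
      linear_idx /= sizes[d];
    }
    return offset;
  }
};
```

For a 2x2 iteration space whose second operand is broadcast along the outer dimension (strides `{1, 0}`), the calculator maps linear indices 0..3 to offsets 0, 1, 0, 1 while touching only two of the `kMaxDims` slots.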
Author
Natalia Gimelshein