Remove the construction of unused tensors (#79183)
Hi there, this statement, `auto columns = at::empty({nInputPlane * kW * kH, outputHeight * outputWidth}, input.options());`, constructs a new tensor and allocates device memory for it. But I found that this tensor is used (at line 185 and line 197) only when `requires_columns` is true.
So we can declare an `at::Tensor columns;` variable (this does not allocate device memory for `columns`) and invoke `at::empty` to construct the tensor only when `requires_columns` is true. As for the statement `int64_t n = columns.size(1);`, the value equals `outputHeight * outputWidth`, so it can be calculated from the arguments of the function `slow_conv2d_forward` without touching `columns`.
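
To illustrate the pattern, here is a minimal sketch that does not depend on ATen. `FakeTensor`, `make_empty`, `g_allocations`, and `forward_sketch` are hypothetical stand-ins for `at::Tensor`, `at::empty`, the device allocator, and `slow_conv2d_forward`; they are not the actual code in this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for at::Tensor: default construction allocates
// nothing, mirroring an undefined `at::Tensor columns;`.
struct FakeTensor {
    std::vector<float> storage;
    bool allocated = false;
    bool defined() const { return allocated; }
};

// Counts simulated allocations (assumption: stands in for device memory use).
static int g_allocations = 0;

// Hypothetical stand-in for at::empty({rows, cols}, options).
FakeTensor make_empty(int64_t rows, int64_t cols) {
    ++g_allocations;
    FakeTensor t;
    t.storage.resize(static_cast<std::size_t>(rows * cols));
    t.allocated = true;
    return t;
}

// Returns n, the value the original code read from columns.size(1).
int64_t forward_sketch(int64_t nInputPlane, int64_t kH, int64_t kW,
                       int64_t outputHeight, int64_t outputWidth,
                       bool requires_columns) {
    FakeTensor columns;  // declared only, no allocation yet
    if (requires_columns) {
        // Allocate only on the path that actually uses the buffer.
        columns = make_empty(nInputPlane * kW * kH, outputHeight * outputWidth);
    }
    assert(columns.defined() == requires_columns);

    // Second dimension of `columns`, computed without the tensor itself.
    const int64_t n = outputHeight * outputWidth;
    return n;
}
```

Calling `forward_sketch(3, 3, 3, 4, 5, false)` leaves `g_allocations` unchanged, while the same call with `requires_columns` set to `true` performs exactly one allocation. Both return the same `n`.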
I profiled resnet50 in `pytorch/benchmark` and found lots of unused tensors in device memory; they are gone after my optimization. It also works well on vgg16, yolov3, alexnet, etc.
Pull Request resolved:
Approved by: