Allocate empty tensor instead of empty_like in binary ops, fix pow (#26498)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26498
We should allocate an empty tensor as the result tensor when performing
binary ops. Currently some ops use `empty_like(self)` as the initial
result tensor before passing it into TensorIterator. This is inefficient
because TensorIterator may resize the tensor due to broadcasting,
triggering a second allocation. By passing an empty tensor as the
result, memory is allocated/resized only once instead of twice.
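The wasted allocation can be sketched in NumPy terms (illustrative only; the actual change lives in ATen's TensorIterator, and the variable names below are hypothetical):

```python
import numpy as np

# Left and right operands of a broadcasting binary op.
self_t = np.empty((3, 1))
other = np.empty((1, 4))

# empty_like-style preallocation copies `self`'s shape...
preallocated = np.empty_like(self_t)

# ...but the broadcast result has a different shape, so the
# iterator would have to resize/reallocate the result anyway.
result_shape = np.broadcast(self_t, other).shape

print(preallocated.shape)  # (3, 1)
print(result_shape)        # (3, 4)
```

Starting from an empty result tensor avoids the throwaway `(3, 1)` allocation entirely.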
Also fixes https://github.com/pytorch/pytorch/issues/26495: one code
path in the `pow` implementation was missing a resize of the result
tensor.
Test Plan:
- new test
- run tests
Differential Revision: D17500025
Pulled By: zou3519
fbshipit-source-id: bff4949af5e75541c04669b961bcf2e1ec456faf