[PyTorch] Devirtualize TensorImpl::numel() with macro (#49766)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49766
Devirtualizing `TensorImpl::numel()` appears to give a decent performance
improvement on internal benchmarks.
The *reason* this is a performance improvement is twofold:
1) virtual calls are a bit slower than regular calls
2) calls to virtual functions in `TensorImpl` generally can't be inlined
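To illustrate the two points above, here is a minimal sketch of how a macro can toggle `virtual` on an accessor. It is not the code from this change: `SketchTensorImpl`, `SKETCH_MAYBE_VIRTUAL`, `SKETCH_DISABLE_EXTENSIBILITY`, and `sum_numel` are hypothetical names chosen for illustration, and the class is a heavily simplified stand-in for `TensorImpl`.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical build-time switch (an assumption, not a name taken from this
// change): when SKETCH_DISABLE_EXTENSIBILITY is defined, the macro expands to
// nothing and the accessor becomes a regular member function.
#ifdef SKETCH_DISABLE_EXTENSIBILITY
#define SKETCH_MAYBE_VIRTUAL
#else
#define SKETCH_MAYBE_VIRTUAL virtual
#endif

// Simplified stand-in for a tensor implementation class.
class SketchTensorImpl {
 public:
  explicit SketchTensorImpl(std::vector<int64_t> sizes)
      : sizes_(std::move(sizes)), numel_(compute_numel()) {}

  virtual ~SketchTensorImpl() = default;

  // Virtual by default so subclasses may override it. With the switch
  // defined, this is a plain call to a body visible in the header, which
  // the compiler is free to inline.
  SKETCH_MAYBE_VIRTUAL int64_t numel() const {
    return numel_;
  }

 private:
  // Product of all dimension sizes; an empty shape (a scalar) yields 1.
  int64_t compute_numel() const {
    int64_t n = 1;
    for (int64_t s : sizes_) {
      n *= s;
    }
    return n;
  }

  std::vector<int64_t> sizes_;  // declared first, so it is initialized first
  int64_t numel_;               // cached element count
};

// Example caller: each numel() call goes through the vtable in the default
// build and is a direct, inlinable call when the switch is defined.
inline int64_t sum_numel(const SketchTensorImpl& a, const SketchTensorImpl& b) {
  return a.numel() + b.numel();
}
```

Under these assumptions, compiling with `-DSKETCH_DISABLE_EXTENSIBILITY` removes `virtual` from `numel()` without touching call sites. The result is the same in both configurations; only the dispatch mechanism differs. The tradeoff is that subclasses can no longer override `numel()` in the devirtualized build.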
Test Plan: internal benchmark
Reviewed By: hlu1
Differential Revision: D25602321
fbshipit-source-id: d61556456ccfd7f10c6ebdc3a52263b438a2aef1