[PyTorch] avoid unnecessary call to empty_tensor_restride in empty() (#48211)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48211
empty() makes this call unconditionally, and it shows up in our empty
benchmark. If MemoryFormat::Contiguous is indeed the common case (or if
workloads tend to use a fairly consistent memory format), then I'd
expect checking the requested format first and skipping the call when
it is Contiguous to be a win.
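A minimal sketch of the idea, not the actual ATen code: ToyTensor,
make_empty, contiguous_strides, and restride_calls are hypothetical
stand-ins invented for illustration, and the sketch assumes that a freshly
created tensor already has contiguous strides, so restriding is only needed
when some other format is requested.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real memory format enum.
enum class MemoryFormat { Contiguous, ChannelsLast };

// Row-major strides for the given sizes.
std::vector<int64_t> contiguous_strides(const std::vector<int64_t>& sizes) {
  std::vector<int64_t> strides(sizes.size(), 1);
  for (int64_t i = static_cast<int64_t>(sizes.size()) - 2; i >= 0; --i) {
    strides[i] = strides[i + 1] * sizes[i + 1];
  }
  return strides;
}

// Hypothetical toy tensor; only tracks sizes and strides.
struct ToyTensor {
  std::vector<int64_t> sizes;
  std::vector<int64_t> strides;
  int restride_calls = 0;  // counts how often the restride path runs

  // Recompute strides for the requested format.
  void restride(MemoryFormat format) {
    ++restride_calls;
    if (format == MemoryFormat::Contiguous) {
      strides = contiguous_strides(sizes);
      return;
    }
    // ChannelsLast: this sketch assumes 4-D sizes in NCHW order.
    assert(sizes.size() == 4);
    strides.assign(4, 1);
    strides[1] = 1;                      // C is the innermost dimension
    strides[3] = sizes[1];               // W steps over C
    strides[2] = strides[3] * sizes[3];  // H steps over W * C
    strides[0] = strides[2] * sizes[2];  // N steps over H * W * C
  }
};

// Toy factory: check the format first, restride only when it is needed.
ToyTensor make_empty(std::vector<int64_t> sizes,
                     MemoryFormat format = MemoryFormat::Contiguous) {
  ToyTensor t;
  t.sizes = std::move(sizes);
  t.strides = contiguous_strides(t.sizes);  // new tensors start contiguous
  if (format != MemoryFormat::Contiguous) {
    t.restride(format);
  }
  return t;
}
```

With this shape, make_empty({2, 3, 4, 5}) never enters restride, while
make_empty({2, 3, 4, 5}, MemoryFormat::ChannelsLast) still takes the slower
path once.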
ghstack-source-id: 118224990
Test Plan:
Profiled the empty benchmark with perf and saw time spent in empty_tensor_restride go down.
Ran the framework overhead benchmarks: ~7% win on empty(), 0.5-1.5% regression on InPlace, ~2% win on OutOfPlace. The InPlace and OutOfPlace results are likely noise, since those benchmarks don't seem to exercise empty().
Reviewed By: bhosmer
Differential Revision: D24914706
fbshipit-source-id: 916771b335143f9b4ec9fae0d8118222ab6e8659