[PyTorch][JIT] Skip unnecessary refcounting in TensorType::merge (#47959)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47959
Taking a shared_ptr by value incurs refcounting overhead and should only be done if the callee needs to take ownership. Otherwise, `const T&` is more efficient. (Specifically, a by-value argument costs an atomic decrement when it is destroyed and probably an atomic increment when it is copied in as well. Passing by `const T&` also takes one fewer register than passing `std::shared_ptr<T>`, but that's less important.)
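A minimal sketch of the difference described above, not code from this diff. `Node` and all function names are hypothetical stand-ins chosen for illustration; the only library calls used are `std::make_shared` and `std::shared_ptr::use_count`.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a refcounted type such as TensorType.
struct Node {
  explicit Node(int r) : rank(r) {}
  int rank;
};

// By value: the caller's shared_ptr is copied into `n`, which increments the
// reference count on entry; `n`'s destructor decrements it on return.
long count_seen_by_value(std::shared_ptr<Node> n) {
  return n.use_count();
}

// By const reference to the shared_ptr: no copy is made, so the reference
// count is not touched.
long count_seen_by_ref(const std::shared_ptr<Node>& n) {
  return n.use_count();
}

// By const reference to the pointee (the `const T&` form): the callee never
// sees a shared_ptr at all, so there is no refcount traffic.
int rank_of(const Node& n) {
  return n.rank;
}

// Usage example: observe the reference count under each calling style.
void demo_refcounts() {
  auto p = std::make_shared<Node>(3);
  assert(p.use_count() == 1);
  assert(count_seen_by_value(p) == 2);  // temporary second owner during the call
  assert(p.use_count() == 1);           // the copy was destroyed on return
  assert(count_seen_by_ref(p) == 1);    // no extra owner was created
  assert(rank_of(*p) == 3);             // caller dereferences; callee borrows
}
```

The by-value form is still the right choice when the callee stores the pointer, since it then needs its own owning copy anyway.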
This diff fixes just this one function, but I'd be happy to audit & fix this whole file in future diffs. Thoughts?
ghstack-source-id: 116914899
Test Plan: build ATen-cpu
Reviewed By: Krovatkin
Differential Revision: D24970954
fbshipit-source-id: 6bdb4b710a94b8baf4ad63418fb38136134e0ef3