[PyTorch] Reduce copy/move in c10::ivalue::from (#52324)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52324
`c10::ivalue::from` took its parameter by value, which creates an
intermediate object on every call. `List` has an expensive move ctor
(it has to copy the type shared_ptr) and dtor (it has to decref the
type, which isn't null), so it's better to avoid intermediate List
objects in function parameters.
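To illustrate the cost of the by-value parameter, here is a minimal sketch that counts move ctor and dtor calls. `Tracked`, `Holder`, `from_by_value`, `from_by_ref`, and the `measure_*` helpers are hypothetical stand-ins, not c10 code, and the rvalue-reference signature is just one possible way to avoid the intermediate object; it is not necessarily the exact signature this change uses.

```cpp
#include <cassert>
#include <utility>

// Global counters for the illustration (hypothetical; not part of c10).
int g_moves = 0;
int g_dtors = 0;

// Hypothetical stand-in for c10::List: every move and destruction is counted.
struct Tracked {
  Tracked() = default;
  Tracked(Tracked&&) noexcept { ++g_moves; }
  ~Tracked() { ++g_dtors; }
};

// Hypothetical stand-in for IValue: takes ownership of the payload.
struct Holder {
  explicit Holder(Tracked&& t) : payload(std::move(t)) {}
  Tracked payload;
};

// By-value parameter: `v` is an intermediate object that must be
// move-constructed from the caller's argument and destroyed on return.
Holder from_by_value(Tracked v) { return Holder(std::move(v)); }

// Reference parameter: binds directly to the caller's object, so the
// only move is the one into the Holder.
Holder from_by_ref(Tracked&& v) { return Holder(std::move(v)); }

struct Counts {
  int moves;
  int dtors;
};

// Reset the counters, run one conversion, and report what it cost.
Counts measure_by_value() {
  g_moves = g_dtors = 0;
  {
    Tracked src;
    Holder h = from_by_value(std::move(src));
    (void)h;
  }
  return {g_moves, g_dtors};
}

Counts measure_by_ref() {
  g_moves = g_dtors = 0;
  {
    Tracked src;
    Holder h = from_by_ref(std::move(src));
    (void)h;
  }
  return {g_moves, g_dtors};
}
```

In this sketch the by-value version performs one more move ctor call and one more dtor call than the reference version, which is the kind of overhead this change removes for `List`.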
ghstack-source-id: 121807292
Test Plan:
Profiled the AdIndexer benchmark; time spent in `push_outputs` is
down from 0.5% to 0.23%.
Comparing assembly for
`c10::impl::push_outputs<c10::List<at::Tensor>, false>::call` before
and after: List move ctor calls went from 4 to 2, and ~intrusive_ptr
calls went from 5 to 3.
Reviewed By: bhosmer
Differential Revision: D26471093
fbshipit-source-id: 7b2c5e8d391a428f2b4d895717a43123c8d7a054