Improve aten::to performance in inline cvr remote_request_only (#53800)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53800
copy_impl improvement:
Before: 1732 ns
After: 1159 ns
remote_request_only:
Before: Milliseconds per iter: 1.24185. Iters per second: 805.252
0.161477 ms. 13.5036%. aten::to (155 nodes)
After: Milliseconds per iter: 1.14195. Iters per second: 875.696
0.113893 ms. 10.339%. aten::to (155 nodes)
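For readers who want the relative change rather than the raw timings, the before/after figures above can be turned into percentage reductions. The following is a minimal sketch, not part of the change itself; the helper name `reduction_pct` and the dictionary labels are hypothetical and chosen only for illustration.

```python
def reduction_pct(before: float, after: float) -> float:
    """Return the percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


# Figures copied from the measurements reported above.
# The labels are illustrative, not names used by the benchmark tool.
measurements = {
    "copy_impl (ns)": (1732.0, 1159.0),
    "remote_request_only (ms per iter)": (1.24185, 1.14195),
    "aten::to (ms)": (0.161477, 0.113893),
}

for name, (before, after) in measurements.items():
    print(f"{name}: {reduction_pct(before, after):.1f}% lower")
```

By this calculation copy_impl is roughly a third faster, aten::to takes a little under 30% less time, and the whole iteration is about 8% faster.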
Test Plan: buck test caffe2:ATen-core-test
Reviewed By: ajyu
Differential Revision: D26967349
fbshipit-source-id: d8f8dc5e8e3df1cec57fa098b21119ec9568e4a5