Fix shared_ptr binary size in op registration (#26869)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26869
Instantiating shared_ptr<Functor> for many distinct Functor types cost us ~1.1MB of binary size in libtorch.so, since each instantiation emits its own template code.
This PR removes that cost.
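
For illustration only, here is a minimal sketch of why per-type shared_ptr instantiations can add up, and one way such overhead might be avoided. It is not taken from this PR's diff: the names KernelBase, AddOne, Doubler, make_kernel_per_type, to_shared and make_kernel_shared_base are hypothetical, and the "cheaper" pattern is an assumption about one possible approach, not a description of the actual change.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for registered kernel functors (not PyTorch types).
struct KernelBase {
  virtual ~KernelBase() = default;  // virtual, so deleting via KernelBase* is safe
  virtual int call(int x) = 0;
};

struct AddOne final : KernelBase {
  int call(int x) override { return x + 1; }
};

struct Doubler final : KernelBase {
  int call(int x) override { return x * 2; }
};

// Costly pattern: std::make_shared<Functor> instantiates a separate
// control-block type (with its own virtual functions) for every Functor.
template <class Functor>
std::shared_ptr<KernelBase> make_kernel_per_type() {
  return std::make_shared<Functor>();
}

// Possibly cheaper pattern (an assumption, see the note above): the only
// shared_ptr control block is the one for KernelBase, created in a single
// non-template function. Per-Functor code shrinks to a plain `new`.
inline std::shared_ptr<KernelBase> to_shared(std::unique_ptr<KernelBase> p) {
  return std::shared_ptr<KernelBase>(std::move(p));
}

template <class Functor>
std::shared_ptr<KernelBase> make_kernel_shared_base() {
  return to_shared(std::make_unique<Functor>());
}

// Both factories hand back kernels that behave identically to callers.
inline void check_same_behavior() {
  assert(make_kernel_per_type<AddOne>()->call(1) ==
         make_kernel_shared_base<AddOne>()->call(1));
  assert(make_kernel_per_type<Doubler>()->call(3) ==
         make_kernel_shared_base<Doubler>()->call(3));
}
```

The trade-off in this sketch is one extra allocation per kernel (object and control block are allocated separately instead of fused by make_shared), which is usually acceptable for objects created once at registration time.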
ghstack-source-id: 90842812
Test Plan: Measure the size of libtorch.so before and after this change.
Differential Revision: D17595674
fbshipit-source-id: 05151047ee8e85c05205b7510a33915ba98bab58