Optimize LeftRight and either (#25133)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25133
This is driven by benchmarks I ran for moving ATen ops to the c10 operator library.
Improvements:
- tell the compiler that the error cases are unlikely so it can optimize the common (non-error) path better
- optimize the cache layout of LeftRight
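
For readers unfamiliar with the two techniques named above, here is a minimal, generic sketch. It is not the code from this change: SKETCH_UNLIKELY, checked_divide, PaddedCounters and the 64-byte cache line size are all hypothetical names and assumptions chosen for illustration. The first part hints that an error branch is rarely taken; the second part places two atomics on separate cache lines so that writes to one do not invalidate the line holding the other.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical macro (not from this change): wraps the GCC/Clang
// __builtin_expect builtin and falls back to a no-op elsewhere.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNLIKELY(expr) (__builtin_expect(static_cast<bool>(expr), 0))
#else
#define SKETCH_UNLIKELY(expr) (expr)
#endif

// Assumption: 64-byte cache lines, which is common but not universal.
constexpr std::size_t kCacheLine = 64;

// The error case is marked unlikely, so the compiler may lay out the
// non-throwing path as the fall-through path.
int checked_divide(int a, int b) {
  if (SKETCH_UNLIKELY(b == 0)) {
    throw std::invalid_argument("division by zero");
  }
  return a / b;
}

// Each counter starts on its own cache line to avoid false sharing
// between threads that touch different counters.
struct PaddedCounters {
  alignas(kCacheLine) std::atomic<std::int32_t> first{0};
  alignas(kCacheLine) std::atomic<std::int32_t> second{0};
};

static_assert(alignof(PaddedCounters) == kCacheLine,
              "struct must be aligned to a cache line");
static_assert(sizeof(PaddedCounters) == 2 * kCacheLine,
              "each counter must occupy its own cache line");
```

Whether either technique pays off depends on the compiler, the target CPU and the access pattern, so changes like this are best confirmed with benchmarks.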
ghstack-source-id: 88907294
Test Plan: unit tests
Differential Revision: D16998010
fbshipit-source-id: 0e3cbff0a4983133a4447ec093444f5d85dd61d6