[nnc] Tweak log_nnc_sleef so vectorization kicks in (#51491)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51491
The vectorizer heuristic is pretty dumb and only kicks in if the
unroll factor is exactly 8 or 4.
log_nnc_sleef is still slower than the direct implementation, which isn't surprising.
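To illustrate the loop shape this is about, here is a minimal sketch in plain Python, not the NNC API. It splits an elementwise log loop into fixed-size blocks plus a scalar tail. The names `log_split` and `LANES` are hypothetical, and the sketch assumes that the fixed-trip-count inner loop is the part a vectorizer would map to a single SIMD operation.

```python
import math

# Hypothetical block size; the summary says the heuristic wants exactly 8 or 4.
LANES = 8


def log_split(xs, lanes=LANES):
    """Elementwise log with the loop split into full blocks plus a scalar tail."""
    n = len(xs)
    out = [0.0] * n
    main_end = n - n % lanes  # last index covered by full blocks

    # Outer loop advances one block at a time. The inner loop has a fixed
    # trip count, which is the shape a vectorizer can recognize.
    for base in range(0, main_end, lanes):
        for lane in range(lanes):
            out[base + lane] = math.log(xs[base + lane])

    # Tail: leftover elements when n is not a multiple of lanes.
    for i in range(main_end, n):
        out[i] = math.log(xs[i])
    return out


if __name__ == "__main__":
    data = [float(i + 1) for i in range(19)]  # 2 full blocks of 8, tail of 3
    assert log_split(data) == [math.log(x) for x in data]
```

The result is the same as a plain loop; only the loop structure changes, which is why a tweak like this affects speed but not output.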
ghstack-source-id: 120783426
Test Plan:
`buck run mode/opt //caffe2/benchmarks/cpp/tensorexpr:tensorexpr_bench`
Before:
```
---------------------------------------------------------------------------
Benchmark                   Time          CPU  Iterations UserCounters...
---------------------------------------------------------------------------
log_nnc_sleef/64          438 ns       438 ns     1795511 log/s=146.259M/s
log_nnc_sleef/512        3196 ns      3195 ns      210032 log/s=160.235M/s
log_nnc_sleef/8192      77467 ns     77466 ns        8859 log/s=105.749M/s
log_nnc_sleef/32768    310206 ns    310202 ns        2170 log/s=105.634M/s
log_nnc_fast/64           100 ns       100 ns     7281074 log/s=637.144M/s
log_nnc_fast/512          546 ns       546 ns     1335816 log/s=938.361M/s
log_nnc_fast/8192        7360 ns      7359 ns       91971 log/s=1.11316G/s
log_nnc_fast/32768      30793 ns     30792 ns       22633 log/s=1064.17M/s
log_aten/64               427 ns       427 ns     1634897 log/s=150.021M/s
log_aten/512              796 ns       796 ns      877318 log/s=643.566M/s
log_aten/8192            6690 ns      6690 ns      102649 log/s=1.22452G/s
log_aten/32768          25357 ns     25350 ns       27808 log/s=1.29263G/s
```
After:
```
---------------------------------------------------------------------------
Benchmark                   Time          CPU  Iterations UserCounters...
---------------------------------------------------------------------------
log_nnc_sleef/64          189 ns       188 ns     3872475 log/s=340.585M/s
log_nnc_sleef/512        1307 ns      1307 ns      557770 log/s=391.709M/s
log_nnc_sleef/8192      20259 ns     20257 ns       34240 log/s=404.404M/s
log_nnc_sleef/32768     81556 ns     81470 ns        8767 log/s=402.209M/s
log_nnc_fast/64           110 ns       110 ns     6564558 log/s=581.116M/s
log_nnc_fast/512          554 ns       554 ns     1279304 log/s=923.376M/s
log_nnc_fast/8192        7774 ns      7774 ns       91421 log/s=1053.75M/s
log_nnc_fast/32768      31008 ns     31006 ns       21279 log/s=1056.83M/s
```
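For a quick read of the numbers, the following sketch computes the log_nnc_sleef speedup and its remaining gap to log_nnc_fast from the Time column of the tables above. The helper name `ratios` and the dictionary names are hypothetical; the values are copied from the benchmark output and nothing is re-measured.

```python
# Wall-clock times in ns, copied from the Time column of the Before/After tables.
SLEEF_BEFORE_NS = {64: 438, 512: 3196, 8192: 77467, 32768: 310206}
SLEEF_AFTER_NS = {64: 189, 512: 1307, 8192: 20259, 32768: 81556}
FAST_AFTER_NS = {64: 110, 512: 554, 8192: 7774, 32768: 31008}


def ratios(numer, denom):
    """Per-size ratio numer[n] / denom[n]."""
    return {n: numer[n] / denom[n] for n in numer}


if __name__ == "__main__":
    speedup = ratios(SLEEF_BEFORE_NS, SLEEF_AFTER_NS)
    gap = ratios(SLEEF_AFTER_NS, FAST_AFTER_NS)
    for n in SLEEF_BEFORE_NS:
        print(
            f"size {n:>5}: sleef speedup {speedup[n]:.2f}x, "
            f"{gap[n]:.2f}x the time of log_nnc_fast"
        )
```

On these numbers, log_nnc_sleef gets roughly 2.3x to 3.8x faster and still takes about 1.7x to 2.6x as long as log_nnc_fast.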
Reviewed By: bwasti
Differential Revision: D26139067
fbshipit-source-id: db31897ee9922695ff9dff4ff46e3d3fbd61f4c2