[inductor] Added bucketize to decomp table (#88348)
Benchmark results for the decomposition compared with eager mode:
```
[--------------------------- bucketize ----------------------------]
                                          | eager | decomp
32 threads: --------------------------------------------------------
((16384, 1024), (16,)), (True, True)      |   600 |    464
((16384, 1024), (16,)), (True, False)     |   542 |    464
((16384, 1024), (16,)), (False, True)     |   780 |    731
((16384, 1024), (16,)), (False, False)    |   777 |    731
((16384, 1024), (64,)), (True, True)      |   624 |    515
((16384, 1024), (64,)), (True, False)     |   603 |    515
((16384, 1024), (64,)), (False, True)     |   789 |    718
((16384, 1024), (64,)), (False, False)    |   786 |    718
((16384, 1024), (256,)), (True, True)     |   878 |    820
((16384, 1024), (256,)), (True, False)    |   891 |    830
((16384, 1024), (256,)), (False, True)    |   897 |    900
((16384, 1024), (256,)), (False, False)   |   900 |    900
((16384, 1024), (1024,)), (True, True)    |  2000 |   1890
((16384, 1024), (1024,)), (True, False)   |  1950 |   1892
((16384, 1024), (1024,)), (False, True)   |  1990 |   1962
((16384, 1024), (1024,)), (False, False)  |  1990 |   2060
((16384, 1024), (4096,)), (True, True)    |  3405 |   3155
((16384, 1024), (4096,)), (True, False)   |  3244 |   3154
((16384, 1024), (4096,)), (False, True)   |  3282 |   3219
((16384, 1024), (4096,)), (False, False)  |  3278 |   3220
((16384, 1024), (16384,)), (True, True)   |  4626 |   4672
((16384, 1024), (16384,)), (True, False)  |  4629 |   4671
((16384, 1024), (16384,)), (False, True)  |  4662 |   4829
((16384, 1024), (16384,)), (False, False) |  4665 |   4824
```
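For context on the op being decomposed, here is a minimal pure-Python sketch of bucketize semantics, written as a fixed-step binary search of the kind a decomposition could express with elementwise ops. It is an illustration only and is not the decomposition added in this PR; the function names `bucketize_reference` and `bucketize_binary_search` are hypothetical, and the code uses plain lists rather than tensors.

```python
from bisect import bisect_left, bisect_right


def bucketize_reference(values, boundaries, right=False):
    """Reference result: index of the bucket each value falls into.

    right=False matches boundaries[i-1] < v <= boundaries[i] (bisect_left);
    right=True matches boundaries[i-1] <= v < boundaries[i] (bisect_right).
    """
    find = bisect_right if right else bisect_left
    return [find(boundaries, v) for v in values]


def bucketize_binary_search(values, boundaries, right=False):
    """Same result via a binary search with a fixed number of steps.

    The step count depends only on len(boundaries), so every value runs
    the same loop; that shape is what makes the search easy to vectorize.
    """
    n = len(boundaries)
    steps = n.bit_length()  # enough halvings to shrink [0, n] to a single index
    out = []
    for v in values:
        lo, hi = 0, n
        for _ in range(steps):
            if lo < hi:  # once converged, the remaining steps are no-ops
                mid = (lo + hi) // 2
                b = boundaries[mid]
                go_right = (b <= v) if right else (b < v)
                if go_right:
                    lo = mid + 1
                else:
                    hi = mid
        out.append(lo)
    return out
```

Both functions assume `boundaries` is sorted in ascending order. The `right` flag only changes which bucket a value lands in when it is exactly equal to a boundary.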
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88348
Approved by: https://github.com/ngimel