[inductor] Type triton size arguments in the kernel index_dtype (#106870)
`JITFunction._key_of` uses the value of an integer argument to choose between
i32 and i64. This fails when a size argument that itself fits in i32 is used in
an indexing calculation whose result exceeds `INT_MAX`.
Instead, type size arguments in the kernel's `index_dtype`, so that all
indexing calculations are performed in the same dtype.
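A rough illustration of the failure mode, in plain Python rather than Triton.
The helpers `value_based_type`, `wrap_i32` and `flat_index` are hypothetical
stand-ins written for this sketch, not Triton or inductor APIs, and the choice
of i64 as the kernel's `index_dtype` is an assumption:

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def value_based_type(value: int) -> str:
    """Hypothetical stand-in for value-based typing: i32 if the value fits."""
    return "i32" if INT32_MIN <= value <= INT32_MAX else "i64"


def wrap_i32(x: int) -> int:
    """Simulate two's-complement wraparound of a signed 32-bit integer."""
    return (x - INT32_MIN) % 2**32 + INT32_MIN


def flat_index(row: int, stride: int, col: int, dtype: str) -> int:
    """Compute row * stride + col, wrapping if the arithmetic is done in i32."""
    result = row * stride + col
    return wrap_i32(result) if dtype == "i32" else result


stride = 70_000          # fits in i32, so value-based typing picks i32
row, col = 69_999, 5     # but row * stride is about 4.9e9, above INT_MAX

# Typed from the argument's value: the index calculation overflows.
narrow = flat_index(row, stride, col, value_based_type(stride))

# Typed in the kernel's index dtype (assumed i64 here): no overflow.
wide = flat_index(row, stride, col, "i64")

print(value_based_type(stride), narrow, wide)
```

The size argument is small enough to be typed as i32, yet the index derived
from it is not, which is why the type is taken from `index_dtype` rather than
from the argument's value.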
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106870
Approved by: https://github.com/lezcano
ghstack dependencies: #106626