Use c10::irange and fix other index types in ForeachReduceOp.cu (#121123)
Following the suggestions in #121066, this PR converts most index loops in ForeachReduceOp.cu to c10::irange and fixes the types of the other index variables.
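
For illustration only, here is a rough sketch of the kind of loop rewrite this describes. It is not taken from the diff: `sketch::irange` is a hypothetical stand-in for `c10::irange` so the snippet compiles without PyTorch headers, and `sum_with_irange` is a made-up function rather than code from ForeachReduceOp.cu.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Hypothetical, simplified stand-in for c10::irange (NOT the real implementation).
// It yields 0, 1, ..., end - 1, with the index type deduced from `end`.
template <typename T>
class IntRange {
 public:
  class Iterator {
   public:
    explicit Iterator(T value) : value_(value) {}
    T operator*() const { return value_; }
    Iterator& operator++() { ++value_; return *this; }
    bool operator!=(const Iterator& other) const { return value_ != other.value_; }
   private:
    T value_;
  };

  // A non-positive end is clamped to 0 so the range is empty.
  explicit IntRange(T end) : end_(end > T(0) ? end : T(0)) {}
  Iterator begin() const { return Iterator(T(0)); }
  Iterator end() const { return Iterator(end_); }

 private:
  T end_;
};

template <typename T>
IntRange<T> irange(T end) { return IntRange<T>(end); }

}  // namespace sketch

// Before (typical pattern being replaced; the int index is compared to a size_t):
//   for (int i = 0; i < values.size(); i++) { total += values[i]; }
//
// After: the index takes the type of the bound, so there is no
// signed/unsigned comparison and no hand-written increment.
int64_t sum_with_irange(const std::vector<int64_t>& values) {
  int64_t total = 0;
  for (const auto i : sketch::irange(values.size())) {
    total += values[i];
  }
  return total;
}
```

In PyTorch itself the loop would use `c10::irange` from `c10/util/irange.h` rather than the stand-in above.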
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121123
Approved by: https://github.com/soulitzer