[inductor] fix cpu implementation of argmax / argmin (#94165)
Fixes #94055
When the reduction numel equals 1, the inner function of argmax / argmin is `return 0`. This inner function loses the data type of `0`, which may result in conflicting types in subsequent calculations. This PR keeps the data type in the inner function.
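
A minimal, pure-Python sketch of the idea, not the actual inductor code. `TypedConstant`, `argmax_inner_untyped`, `argmax_inner_typed`, and `emit_cpp_literal` are hypothetical names invented for illustration, and the `int64` default is an assumption based on argmax / argmin returning 64-bit indices:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedConstant:
    """Hypothetical stand-in for a constant that carries its dtype."""
    value: int
    dtype: str


def argmax_inner_untyped(index):
    # Before: a bare Python int, so the dtype of the index is lost.
    return 0


def argmax_inner_typed(index, dst_dtype="int64"):
    # After: the constant is tagged with the reduction's output dtype.
    return TypedConstant(0, dst_dtype)


def emit_cpp_literal(node):
    """Hypothetical codegen step that renders a node as a C++ expression."""
    if isinstance(node, TypedConstant):
        ctype = {"int64": "int64_t", "int32": "int32_t"}[node.dtype]
        return f"static_cast<{ctype}>({node.value})"
    # A bare literal such as `0` is an `int` in C++, which can conflict
    # with 64-bit operands later in the generated kernel.
    return str(node)


print(emit_cpp_literal(argmax_inner_untyped(0)))
print(emit_cpp_literal(argmax_inner_typed(0)))
```

In this sketch the untyped path renders a plain `0`, while the typed path renders an explicit cast to the destination type, so later expressions see a consistent type.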
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94165
Approved by: https://github.com/jgong5, https://github.com/Neilblaze, https://github.com/jansel