[7/N] [Dispatchable Collectives] Update reduce with CPU / CUDA implementations (#83916)
### Changes
- Update the reduce collective with CPU and CUDA implementations, as part of the Dispatchable Collectives series
### Context
https://github.com/pytorch/pytorch/issues/86225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83916
Approved by: https://github.com/kwen2501