[5/N] [Dispatchable Collectives] Update send with CPU / CUDA implementations (#83859)

Differential Revision: [D40044550](https://our.internmc.facebook.com/intern/diff/D40044550)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83859
Approved by: https://github.com/kwen2501