pytorch
ccac8d13 - [3/N] [Dispatchable Collectives] Update broadcast_ with CPU and CUDA implementations (#83735)

Commit
2 years ago
[3/N] [Dispatchable Collectives] Update broadcast_ with CPU and CUDA implementations (#83735)

### About this PR
* Update the broadcast op to dispatch to cpu and cuda implementations. Right now they both perform the same logic so this is essentially a no-op.
* Add test to validate that a separate device implementation is not supported.

### About this stack
In the future we will repurpose ProcessGroup to instead contain a list of Backends (ProcessGroupNCCL/Gloo/UCC) and perform dispatching to them based on tensor type. The CPU and CUDA implementations will be updated to have process group select its CPU and CUDA backends respectively.

Differential Revision: [D38876771](https://our.internmc.facebook.com/intern/diff/D38876771)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83735
Approved by: https://github.com/kwen2501