pytorch
ab6c5721 - Add NCCL PreMul Sum to c10d `reduce` ops (#84243)

Commit
2 years ago
Add NCCL PreMul Sum to c10d `reduce` ops (#84243)

This is based on #81272, but it conforms to the TorchScript compiler.

## TODO
- [ ] Update https://github.com/pytorch/pytorch/blob/abaf8112e6d6bed2a5d33dcbc1d46ed20b8e80de/torch/csrc/distributed/c10d/ProcessGroupUCC.cpp#L64-L73 to use `ReduceOp::RedOpType`. In my first try with `USE_SYSTEM_UCC=1`, this change wasn't necessary (I think) because of the `ReduceOp::RedOpType` operator. That being said, I want to make it more explicit.

cc @ptrblck @kwen2501 @aazzolini
cc @zasdfgbnm for visibility to the TODO above

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84243
Approved by: https://github.com/kwen2501
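
For readers unfamiliar with the op, a pre-multiplied sum scales every rank's input by a scalar factor before the values are summed across ranks. The sketch below only simulates that arithmetic in a single process with plain Python lists. The name `premul_sum` and the list-of-lists layout are hypothetical, chosen for illustration, and are not part of c10d or NCCL.

```python
from typing import List, Sequence


def premul_sum(rank_inputs: Sequence[Sequence[float]], factor: float) -> List[float]:
    """Simulate a pre-multiplied sum reduction (hypothetical helper, not a c10d API).

    rank_inputs[r] stands in for the buffer contributed by rank r.
    Each element is multiplied by `factor` and then summed across ranks.
    """
    if not rank_inputs:
        raise ValueError("at least one rank buffer is required")
    length = len(rank_inputs[0])
    if any(len(buf) != length for buf in rank_inputs):
        raise ValueError("all rank buffers must have the same length")
    # Scale first, then reduce elementwise across ranks.
    return [sum(factor * buf[i] for buf in rank_inputs) for i in range(length)]


# Four simulated ranks, two elements each.
inputs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

# A factor of 1 / world_size turns the sum into an average.
averaged = premul_sum(inputs, 1.0 / len(inputs))
```

With a factor of `1 / world_size`, the reduction yields an average without a separate division pass after the sum, which is one common reason to pre-multiply.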