[thread_pg] fix all_reduce to respect different cuda device (#107151)
The previous implementation only works on CPU: it does not account for each rank holding its data on a different device (e.g. a different CUDA device), so the reduction raises an error like the following:
```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
```
See the report at https://github.com/pytorch/pytorch/pull/105604#issuecomment-1675472670
This PR fixes the issue; the previously failing GPU tests now pass.
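
For reference, the core idea is to copy each rank's tensor onto a common device before accumulating, rather than assuming all operands are co-located. A minimal sketch of that technique for a sum reduction (the helper name `_all_reduce_sum` and the list-of-tensors interface are illustrative assumptions, not the actual patch):
```python
import torch

def _all_reduce_sum(tensors):
    # Sketch only: tensors[i] lives on rank i's device (cuda:0, cuda:1, ...).
    # Accumulate on the first tensor's device, explicitly moving each operand
    # there instead of assuming the operands already share a device.
    result = tensors[0].clone()
    for t in tensors[1:]:
        result += t.to(result.device)  # cross-device copy before the add
    # Write the reduced value back to every rank, on that rank's own device.
    for t in tensors:
        t.copy_(result.to(t.device))
```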
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107151
Approved by: https://github.com/kumpera