add the worker IDs outside of addSendRpcBackward to ensure they are (#30914)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30914
When tensors don't require grad, we don't call `addSendRpcBackward`, which is where we record the known workerIDs needed to clean up the dist autograd context later. But since https://github.com/pytorch/pytorch/pull/29781, we always include the autograd context ID in RPCs, even if tensors do not require grad. As a result, the contexts may never be released on some nodes.
Since the contexts are not cleaned up in this case, this can contribute to OOMs, which can be verified by running the unit test without this patch. We can fix the issue by moving the `addKnownWorkerIds` call into the `getMessageWithAutograd` function, so the destination worker is recorded on every RPC that carries an autograd context ID.
ghstack-source-id: 95178561
Test Plan: Added a unit test: `test_context_cleanup_tensor_no_grad`
Differential Revision: D18869191
fbshipit-source-id: b80f66bfd0dd7d01960abe1691d3f44095bb1b2b