pytorch
7818e7e5 - Basic framework for Distributed Autograd context. (#24875)

Commit

5 years ago

Basic framework for Distributed Autograd context. (#24875)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24875

As per https://github.com/pytorch/pytorch/issues/23110, each autograd pass would be assigned a unique autograd_context_id. In this change we introduce a DistAutogradContainer per worker which holds information for each autograd pass currently running.

DistAutogradContainer has a map from the autograd_context_id to DistAutogradContext (which holds all the relevant information for the autograd pass). DistAutogradContext currently only stores the autograd_context_id and more information would be added to it later as we build out the rest of the framework.

The autograd_context_id is a 64 bit globally unique integer where the first 16 bits are the worker_id and next 48 bits are auto-incrementing for uniqueness.

Sample python code on how this would be used for distributed autograd:

```
import torch.distributed.autograd as dist_autograd

worker_id = 0
dist_autograd.init(worker_id)
with dist_autograd.context() as context_id:
    # forward pass...
    # backward pass...
    # optimizer step...
```

ghstack-source-id: 89119248

Test Plan: unit tests.

Differential Revision: D16356694

fbshipit-source-id: d1a8678da0c2af611758dbb5d624d554212330ce
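The ID layout and per-worker container described in the summary can be illustrated with a small pure-Python sketch. This is not the code from the commit (which lives in C++). It assumes that "first 16 bits" means the 16 most significant bits, and the names `make_context_id`, `split_context_id`, and `ContextContainer` are hypothetical stand-ins chosen for illustration.

```python
import threading

# Assumed layout: [ 16-bit worker_id | 48-bit auto-incrementing counter ]
WORKER_ID_BITS = 16
COUNTER_BITS = 48
MAX_WORKER_ID = (1 << WORKER_ID_BITS) - 1
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_context_id(worker_id: int, counter: int) -> int:
    """Pack a worker id and a per-worker counter into one 64-bit integer."""
    if not 0 <= worker_id <= MAX_WORKER_ID:
        raise ValueError("worker_id must fit in 16 bits")
    if not 0 <= counter <= COUNTER_MASK:
        raise ValueError("counter must fit in 48 bits")
    return (worker_id << COUNTER_BITS) | counter


def split_context_id(context_id: int) -> tuple:
    """Recover (worker_id, counter) from a packed context id."""
    return context_id >> COUNTER_BITS, context_id & COUNTER_MASK


class ContextContainer:
    """Hypothetical stand-in for a per-worker container of autograd contexts."""

    def __init__(self, worker_id: int) -> None:
        self._worker_id = worker_id
        self._next_counter = 0
        self._lock = threading.Lock()
        # Maps context id -> context record (here just a dict holding the id).
        self._contexts = {}

    def new_context(self) -> int:
        """Allocate a fresh context id and register a record for it."""
        with self._lock:
            context_id = make_context_id(self._worker_id, self._next_counter)
            self._next_counter += 1
            self._contexts[context_id] = {"context_id": context_id}
            return context_id

    def release_context(self, context_id: int) -> None:
        """Drop the record for a finished autograd pass."""
        with self._lock:
            del self._contexts[context_id]

    def __len__(self) -> int:
        with self._lock:
            return len(self._contexts)
```

Because the worker id occupies its own bit range, two workers can allocate IDs independently without coordination and still never collide, as long as each worker's counter stays below 2^48.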

Author

pritamdamania

Committer

facebook-github-bot

Parents

8e189a32
