pytorch
e7e6d56b - Allow async work in rpc RequestCallback processing. (#30637)


Commit

4 years ago

Allow async work in rpc RequestCallback processing. (#30637)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30637

RequestCallback api currently forces work to be always synchronous, which, as we scale, means we're going to need to throw large number of (mostly blocked) threads at the rpc problem. For some activities like dependent autograd rpcs, there's not a necessary reason to block in these threads.

In this change, the RequestCallback api is updated to return a shared_ptr<FutureMessage> rather than a Message:

    std::shared_ptr<FutureMessage> operator()(Message& request) const;

With a futures-style api, RPC ops that wish to be async can then be async, while short-lived blocking functions (or Python UDFs) can just block.

In this change, we keep all of the current ops as synchronous (i.e. we block and then return a completed FutureMessage). We also update the rpc_agents in a manner compatible with this sort of parallelism.

Here, we only want to incur overhead when we use the async behavior. Some modest extra cost seems unavoidable here (e.g. the allocation for the std::make_shared<>), but we can trivially detect the synchronous/completed case in the rpc_agent and avoid the extra thread-switches/etc. in that case.

ghstack-source-id: 95287026

Test Plan:
- Basic: buck test mode/dev-nosan caffe2/test/...
- Additional testcase in ThriftRpcAgentTest for deferred work.

Differential Revision: D18774322

fbshipit-source-id: cf49922a71707cfb1726de16f93af23b160385d8
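The futures-style callback and the agent's completed-case fast path described above can be sketched roughly as follows. This is a self-contained illustration, not the code from the PR: `Message` and `FutureMessage` here are simplified stand-ins written for this example, and `EchoCallback`, `DeferredCallback`, and `handleRequest` are hypothetical names. Only the `operator()(Message& request) const` signature is taken from the commit message.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the RPC message type (not the real torch class).
struct Message {
  std::string payload;
};

// Hypothetical minimal future: stores a Message once completed and runs any
// callbacks that were registered before completion.
class FutureMessage {
 public:
  using Callback = std::function<void(const Message&)>;

  FutureMessage() = default;
  // Construct an already-completed future (the synchronous case).
  explicit FutureMessage(Message msg)
      : completed_(true), value_(std::move(msg)) {}

  bool completed() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return completed_;
  }

  void markCompleted(Message msg) {
    std::vector<Callback> callbacks;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      value_ = std::move(msg);
      completed_ = true;
      callbacks.swap(callbacks_);
    }
    // Run callbacks outside the lock so they may touch the future safely.
    for (auto& cb : callbacks) cb(value_);
  }

  void addCallback(Callback cb) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      if (!completed_) {
        callbacks_.push_back(std::move(cb));
        return;
      }
    }
    cb(value_);  // Already completed: run immediately.
  }

  // Only meaningful once completed() returns true.
  const Message& value() const { return value_; }

 private:
  mutable std::mutex mutex_;
  bool completed_ = false;
  Message value_;
  std::vector<Callback> callbacks_;
};

// Synchronous handler: does its work inline, then returns a completed future.
struct EchoCallback {
  std::shared_ptr<FutureMessage> operator()(Message& request) const {
    return std::make_shared<FutureMessage>(Message{"echo:" + request.payload});
  }
};

// Deferred handler: returns a pending future that is completed later.
struct DeferredCallback {
  std::shared_ptr<FutureMessage> pending = std::make_shared<FutureMessage>();
  std::shared_ptr<FutureMessage> operator()(Message& /*request*/) const {
    return pending;
  }
};

// Agent-side dispatch (hypothetical): respond inline when the future is
// already completed, otherwise respond from a completion callback.
// `sent` must outlive any pending future; it stands in for the wire.
template <typename Callback>
void handleRequest(const Callback& cb, Message& request,
                   std::vector<std::string>& sent) {
  auto future = cb(request);
  if (future->completed()) {
    sent.push_back(future->value().payload);  // Fast path, no thread switch.
    return;
  }
  future->addCallback(
      [&sent](const Message& m) { sent.push_back(m.payload); });
}
```

In this sketch the synchronous path pays only for the `std::make_shared` allocation, which mirrors the trade-off the commit message describes; how the real rpc_agents schedule the deferred response is not shown in the source and is left out here.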

Author

jjlilley

Committer

facebook-github-bot

Parents

e42af973
