pytorch
c5af0afd - catch exceptions in ProcessGroupAgent::enqueueSend and report them. (#31023)

Commit
5 years ago
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31023

Adds support for catching exceptions in ProcessGroupAgent::enqueueSend and reporting them through the future, by marking the future as completed with an exception that describes the error. One case where this can happen is when the receiving side aborts while the sender is sending the message. Previously, we would hang until the timeout was hit, and the original exception would be lost.

ghstack-source-id: 96498386

Test Plan: Added a relevant unit test: `test_sender_exceptions` in rpc_test.py

Differential Revision: D18901981

fbshipit-source-id: 08de26936c4ad45b837219a247088cbea644c04c
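The pattern described in the summary can be illustrated with a minimal Python sketch. This is not the actual C++ change in ProcessGroupAgent; it only assumes the general idea of a send that runs on a worker thread and a future that the caller waits on. The names `enqueue_send`, `broken_send`, and `wait_for_error` are hypothetical and exist only for this illustration.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def enqueue_send(pool, send_fn, message):
    """Schedule send_fn(message) on the pool and return a future for the reply.

    Hypothetical stand-in for the idea behind enqueueSend: if the send
    raises, the future is completed with an exception instead of being
    left pending until a timeout.
    """
    reply_future = Future()

    def task():
        try:
            send_fn(message)
        except Exception as exc:
            # Report the failure through the future so the caller sees
            # the original error rather than hanging.
            reply_future.set_exception(RuntimeError(f"send failed: {exc}"))

    pool.submit(task)
    return reply_future


def broken_send(message):
    # Simulates the receiving side aborting while the message is sent.
    raise ConnectionError("peer aborted")


def wait_for_error(future, timeout=5.0):
    """Return the error text stored in the future, or None on success."""
    try:
        future.result(timeout=timeout)
    except RuntimeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as pool:
        fut = enqueue_send(pool, broken_send, b"payload")
        print(wait_for_error(fut))
```

Without the `try`/`except` in `task`, the future in this sketch would never be completed, and the caller would wait until the timeout expired, which mirrors the hang the commit message describes.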