Error Handling in RPC Agent (#35263)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35263
The Process Group Agent throws an exception if a send is attempted after the agent has been shut down. When retrying a send, we should catch this exception and mark the original future with an error.
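
A rough illustration of the pattern described above, written in Python only for brevity; the actual change lives in the RPC agent code and is not reproduced here. `ToyAgent`, `AgentShutdownError`, and `retry_send` are hypothetical names invented for this sketch, not PyTorch APIs.

```python
from concurrent.futures import Future


class AgentShutdownError(RuntimeError):
    """Hypothetical stand-in for the exception thrown on send after shutdown."""


class ToyAgent:
    """Hypothetical agent used only for illustration (not ProcessGroupAgent)."""

    def __init__(self):
        self.running = True

    def shutdown(self):
        self.running = False

    def send(self, message):
        # Mirrors the described behavior: sending after shutdown throws.
        if not self.running:
            raise AgentShutdownError("send attempted after agent shutdown")
        fut = Future()
        fut.set_result(f"ack:{message}")
        return fut


def retry_send(agent, message, original_future):
    """Perform one retry; report success or the send error on original_future."""
    try:
        attempt = agent.send(message)
    except AgentShutdownError as exc:
        # Do not let the exception escape the retry path; surface it to the
        # caller through the future they are already waiting on.
        original_future.set_exception(exc)
        return
    original_future.set_result(attempt.result())


# Retry after shutdown: the original future ends up holding the error.
agent = ToyAgent()
agent.shutdown()
original = Future()
retry_send(agent, "ping", original)
shutdown_error = original.exception()
```

With this shape, a caller blocked on the original future observes the shutdown error through that future rather than having the exception escape from the retry logic.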
ghstack-source-id: 102153897
Test Plan: Run all rpc/dist_autograd tests.
Differential Revision: D20611412
fbshipit-source-id: a6009f0b0aa8be662364158962a054c5c29090bf