[Tensorpipe Agent] Timeouts for RPC requests (#38448)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38448
This PR adds timeout support for RPCs in the TensorPipe agent and honors the new per-RPC timeout setting.
For each RPC, the send function adds the RPC's future to a map keyed by expiration time.
A separate watchdog thread polls this map and sets an error on every future that has expired without completing.
Note: we cannot set an error on a future while holding the lock. Setting the error triggers the future's callbacks immediately, and if one of those callbacks tries to acquire the lock we are holding, we have a lock order cycle. Instead, we collect the incomplete futures in a list while holding the lock, then iterate through the list after releasing the lock and set errors on those futures if necessary.
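
For illustration only, here is a minimal sketch of the pattern described above: collect expired futures while holding the lock, then set errors after releasing it. It is not the code from this PR. SketchFuture, TimeoutTracker, track, pending and expireOnce are hypothetical names, and SketchFuture is a simplified stand-in rather than the real RPC future type. The watchdog thread itself is omitted; it is assumed to call expireOnce periodically.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for an RPC future (not the real PyTorch type).
class SketchFuture {
 public:
  bool completed() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return done_;
  }
  std::string error() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return error_;
  }
  void addCallback(std::function<void()> cb) {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      if (!done_) {
        callbacks_.push_back(std::move(cb));
        return;
      }
    }
    cb();  // Already completed: run right away.
  }
  // Returns true if this call completed the future. Callbacks run
  // synchronously on the calling thread, as described in the note above.
  bool setError(const std::string& msg) {
    std::vector<std::function<void()>> cbs;
    {
      std::lock_guard<std::mutex> guard(mutex_);
      if (done_) return false;
      done_ = true;
      error_ = msg;
      cbs.swap(callbacks_);
    }
    for (auto& cb : cbs) cb();
    return true;
  }

 private:
  mutable std::mutex mutex_;
  bool done_ = false;
  std::string error_;
  std::vector<std::function<void()>> callbacks_;
};

// Hypothetical tracker: futures grouped by expiration time.
class TimeoutTracker {
 public:
  // Called from the send path for each RPC.
  void track(Clock::time_point expiry, std::shared_ptr<SketchFuture> fut) {
    assert(fut != nullptr);
    std::lock_guard<std::mutex> guard(mutex_);
    byExpiry_[expiry].push_back(std::move(fut));
  }
  std::size_t pending() const {
    std::lock_guard<std::mutex> guard(mutex_);
    std::size_t n = 0;
    for (const auto& entry : byExpiry_) n += entry.second.size();
    return n;
  }
  // One pass of the watchdog loop. Returns how many futures were timed out.
  std::size_t expireOnce(Clock::time_point now) {
    std::vector<std::shared_ptr<SketchFuture>> expired;
    {
      // Under the lock: only move expired entries out of the map.
      std::lock_guard<std::mutex> guard(mutex_);
      auto end = byExpiry_.upper_bound(now);  // first key strictly after now
      for (auto it = byExpiry_.begin(); it != end; ++it) {
        for (auto& fut : it->second) expired.push_back(std::move(fut));
      }
      byExpiry_.erase(byExpiry_.begin(), end);
    }
    // Outside the lock: callbacks may safely call back into the tracker.
    std::size_t timedOut = 0;
    for (auto& fut : expired) {
      if (fut->setError("RPC timed out")) ++timedOut;
    }
    return timedOut;
  }

 private:
  mutable std::mutex mutex_;
  std::map<Clock::time_point, std::vector<std::shared_ptr<SketchFuture>>> byExpiry_;
};
```

In this sketch, a callback that calls pending() (which takes the tracker's mutex) completes normally because expireOnce has already released that mutex before calling setError. Calling setError inside the locked scope instead would make the same callback block on a non-recursive std::mutex.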
ghstack-source-id: 104227075
Test Plan: Will patch the testing diff on top of this to run tests.
Differential Revision: D21468526
fbshipit-source-id: 4514484ece6fb6be673427d44c7f3164ab3d9d7c