Add faulty tensorpipe implementation (#61421)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61421
This PR adds the faulty tensorpipe agent implementation and switches all faulty process group agent tests over to it. The faulty tensorpipe agent code closely mirrors that of the faulty process group agent: it lets the user fail or delay certain types of RPC messages, which the faulty agent tests rely on. These changes are needed to deprecate the process group RPC backend.
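To illustrate the fail/delay behaviour described above, here is a minimal, self-contained Python sketch. It is not the code added in this PR: the class `FaultySender`, its parameters (`messages_to_fail`, `messages_to_delay`, `num_fail_sends`), and the message type strings are hypothetical names chosen for illustration, and the assumed policy (fail the first few sends of a type, then let later sends through) is an assumption about how such an agent could behave.

```python
import time
from typing import Callable, Dict, Iterable


class FaultySender:
    """Toy wrapper that fails or delays sends by message type.

    Hypothetical illustration only; not the PyTorch faulty agent.
    """

    def __init__(
        self,
        send: Callable[[str, bytes], None],
        messages_to_fail: Iterable[str],
        messages_to_delay: Dict[str, float],
        num_fail_sends: int,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._send = send
        self._to_fail = set(messages_to_fail)
        self._to_delay = dict(messages_to_delay)
        self._num_fail_sends = num_fail_sends
        self._fail_counts: Dict[str, int] = {}
        self._sleep = sleep  # injectable so tests need not really sleep

    def send(self, msg_type: str, payload: bytes) -> None:
        # Assumed policy: fail the first `num_fail_sends` sends of each
        # configured type, then let later sends of that type through.
        if msg_type in self._to_fail:
            count = self._fail_counts.get(msg_type, 0)
            if count < self._num_fail_sends:
                self._fail_counts[msg_type] = count + 1
                raise RuntimeError(f"Intentionally failed send of {msg_type}")

        # Delay configured types before handing off to the real send.
        delay = self._to_delay.get(msg_type, 0.0)
        if delay > 0:
            self._sleep(delay)

        self._send(msg_type, payload)
```

A test built on such a wrapper would configure one message type to fail, then check that the caller sees an error or a timeout for that type while other message types are still delivered.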
Summary of changes:
- Add faulty tensorpipe agent class
- Update the tensorpipe pipeWrite function so that it can be overridden and a delay can be added
- Update test backend registry and faulty agent tests to use the FAULTY_TENSORPIPE_AGENT backend.
This affects all faulty agent tests; here are a few of them as sample commands:
`pytest test/distributed/rpc/test_faulty_agent.py -vs -k test_verify_backend_options`
`pytest test/distributed/rpc/test_faulty_agent.py -vs -k test_no_faulty_messages`
`pytest test/distributed/rpc/test_faulty_agent.py -vs -k test_builtin_remote_message_dropped_timeout`
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D29773739
Pulled By: H-Huang
fbshipit-source-id: 6b2bc366735d70b79943d4207f454bc9555bbf5f