Make ProcessGroupAgent take num_send_recv_threads as constructor argument (#26313)
Summary:
# Problem
If the RPC agent's thread pool does not have enough threads, circularly dependent work items can deadlock.
The current workaround for this deadlock is to provide an abundant number of threads.
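
A minimal sketch of this kind of thread starvation, using Python's standard `concurrent.futures` pool as a stand-in rather than the RPC agent itself. The names `run_nested`, `outer`, and `inner` are hypothetical, and the timeout is there only so the example terminates instead of hanging:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def run_nested(num_threads, timeout=0.5):
    """Run a task that waits on a second task submitted to the same pool."""
    pool = ThreadPoolExecutor(max_workers=num_threads)

    def inner():
        return "done"

    def outer():
        # outer holds a worker thread while it waits for inner,
        # and inner needs a free worker thread in the same pool.
        fut = pool.submit(inner)
        try:
            return fut.result(timeout=timeout)
        except TimeoutError:
            # Without the timeout this wait would never return.
            return "stalled"

    try:
        return pool.submit(outer).result()
    finally:
        pool.shutdown(wait=True)


if __name__ == "__main__":
    print(run_nested(1))  # one thread: outer starves inner
    print(run_nested(2))  # two threads: inner gets its own worker
```

With a single worker the outer task can never be served by the pool it is blocking, which is why sizing the pool generously avoids the problem.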
# Solution
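As titled: make ProcessGroupAgent take num_send_recv_threads as a constructor argument, so that the caller can choose the size of the thread pool.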
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26313
Differential Revision: D17405491
Pulled By: xush6528
fbshipit-source-id: a1d9b6a84db0371cd4b63328fa00f651c0808485