Propagate ProcessGroup timeout to Store (#16571)
Summary:
closes #16520
Hi pietern, I am not sure whether this is the expected way to pass the timeout to `Store`. Could you please take a look? Thanks!
Questions:
1. How should I write tests for this? I wanted to add something like `test_barrier_timeout_global`, but it seems I would need to set the process group's timeout larger than the `Store`'s default timeout (3 min) to see a difference, which is too long for a unit test. I also do not want to change the `Store`'s default timeout. Any suggestions?
2. Should I also propagate the timeout configuration down to `PrefixStore` in `_new_process_group_helper`?
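
For illustration, here is a minimal sketch of the behavior in question, written with toy classes rather than the real c10d types. `ToyStore`, `ToyPrefixStore`, and `new_process_group` are hypothetical names introduced only for this example, and forwarding the timeout through the prefix wrapper is an assumption about one possible answer to question 2, not what this change necessarily does.

```python
from datetime import timedelta


class ToyStore:
    """Hypothetical stand-in for a store; NOT the real c10d Store."""

    # The 3 minute default mirrors the value mentioned in question 1.
    DEFAULT_TIMEOUT = timedelta(minutes=3)

    def __init__(self):
        self.timeout = self.DEFAULT_TIMEOUT

    def set_timeout(self, timeout):
        self.timeout = timeout


class ToyPrefixStore:
    """Hypothetical wrapper that namespaces keys of an underlying store."""

    def __init__(self, prefix, store):
        self.prefix = prefix
        self.store = store

    def set_timeout(self, timeout):
        # Assumption: the wrapper forwards the timeout to the wrapped store.
        self.store.set_timeout(timeout)

    @property
    def timeout(self):
        return self.store.timeout


def new_process_group(store, timeout):
    """Hypothetical helper: push the group's timeout down to its store."""
    store.set_timeout(timeout)
    return {"store": store, "timeout": timeout}
```

In this toy model a test does not need to wait at all: it can check that the store's `timeout` attribute changed after the group was created, instead of observing an actual timeout. Whether the real `Store` exposes its timeout for such a check is not something this sketch establishes.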
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16571
Differential Revision: D13954527
Pulled By: mrshenli
fbshipit-source-id: 77f2653903f24255207233eb298f7c0321119a87