[reland][c10d] monitored_barrier: ensure all ranks pass or none do (#55990)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55990
Reland of https://github.com/pytorch/pytorch/pull/55197, which failed a Windows test that is only run on master.
Disabled these tests on Windows, as they already are on macOS. They are disabled because they use the libuv transport, whose error handling is not as robust as that of tcp on Linux. As a result, healthy non-zero ranks do not throw immediately (as they do on Linux) but throw on timeout instead. Error handling on rank 0 still occurs as expected on all platforms.
ghstack-source-id: 126478371
Test Plan: CI
Reviewed By: zhaojuanmao
Differential Revision: D27758424
fbshipit-source-id: d30841c8dda77f51b09a58161e638657ef758e63