Fix c10d TCP store with mutex (#68499)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68499
TCP store is accessed from multiple threads (for example, the NCCL watchdog thread), but unlike FileStore and HashStore it has no mutex protection. Because enabling desync root cause analysis makes store calls more frequent, the race condition in TCP store was always triggered when creating another process group such as gloo. This change adds a mutex to TCP store, making it consistent with FileStore and HashStore.
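As an illustration of the locking pattern only, here is a minimal sketch of a key-value store whose public operations each take a mutex before touching shared state. GuardedStore, hammer, and the "counter" key are hypothetical names made up for this example; this is not the c10d TCPStore code, which talks to a server over a socket rather than holding in-memory maps.

```cpp
#include <cstdint>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a store shared between threads.
// It is NOT the c10d TCPStore class; it only shows the locking pattern.
class GuardedStore {
 public:
  // Every public operation holds the mutex for its whole duration,
  // so calls from different threads cannot interleave.
  void set(const std::string& key, const std::vector<uint8_t>& value) {
    const std::lock_guard<std::mutex> lock(mutex_);
    data_[key] = value;
  }

  // Returns a copy of the value, or an empty vector if the key is absent.
  std::vector<uint8_t> get(const std::string& key) {
    const std::lock_guard<std::mutex> lock(mutex_);
    auto it = data_.find(key);
    if (it == data_.end()) {
      return {};
    }
    return it->second;
  }

  // Adds delta to a named counter and returns the new value.
  int64_t add(const std::string& key, int64_t delta) {
    const std::lock_guard<std::mutex> lock(mutex_);
    counters_[key] += delta;
    return counters_[key];
  }

 private:
  std::mutex mutex_;
  std::unordered_map<std::string, std::vector<uint8_t>> data_;
  std::unordered_map<std::string, int64_t> counters_;
};

// Starts numThreads threads that each call add("counter", 1) iters times,
// then returns the final counter value.
int64_t hammer(GuardedStore& store, int numThreads, int iters) {
  std::vector<std::thread> threads;
  for (int t = 0; t < numThreads; ++t) {
    threads.emplace_back([&store, iters] {
      for (int i = 0; i < iters; ++i) {
        store.add("counter", 1);
      }
    });
  }
  for (auto& th : threads) {
    th.join();
  }
  return store.add("counter", 0);
}
```

With the lock in place, concurrent callers such as a watchdog thread and the main thread see a consistent counter; removing the lock_guard lines would make the same calls a data race.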
Test Plan:
DDP benchmark with desync debug enabled shows no perf regression:
https://www.internalfb.com/intern/fblearner/details/309398285?tab=Outputs
Without this diff:
https://www.internalfb.com/intern/fblearner/details/308379789?tab=Outputs
Reviewed By: mingzhe09088
Differential Revision: D32482254
fbshipit-source-id: e8f466e1c6fdcab6cfa170f44b9be70395935fb8