Address temp file/bind race condition in torch_shm_manager (#57309)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57309
Addressing a race condition that can occur in `torch_shm_manager` between the time its temporary file is unlinked and when it `bind()`s the manager server socket to that same name. In that time window, other threads/processes can re-create another temporary file with the same name, causing `bind()` to fail with `EADDRINUSE`.
This diff introduces `c10::TempDir` and associated helper functions that mirror those of `c10::TempFile` and generates the manager socket name using a combination of a temporary directory, which will be valid for the lifetime of `torch_shm_manager`, and a well-known file name within that directory that will never be used outside of `bind()`.
Reviewed By: ejguan
Differential Revision: D28047914
fbshipit-source-id: 148d54818add44159881d3afc2ffb31bd73bcabf