Fix process hang in process-group shutdown (#7941)
Removing the file used as the file-store while the process-group is
still active is invalid as it is still in use.
If `reuse_dist_env` is `True`, the process group is still active and the
processes will try to read from that file, waiting for it to exist.
During shutdown (`destroy_process_group`) they wait for all threads to
join, but at least one thread is still waiting for that file. This causes
the process to hang until a PyTorch-internal timeout is reached, which
is currently about 5 minutes.
The solution is to create a unique file. I chose to put it in `tmpdir`
and add a suffix to differentiate it.
Note that using `tmpdir` alone is not enough: this method is already
called once through the fixture setup, so the directory is no longer
clean when the method is called again later during test execution.
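For illustration, a minimal sketch of the idea, not the actual patch. The
helper name `make_init_method`, the `dist_file_store` prefix and the use
of `uuid` for the suffix are assumptions made for this example; the
real change may derive the suffix differently.

```python
import os
import tempfile
import uuid


def make_init_method(tmpdir: str, prefix: str = "dist_file_store") -> str:
    """Return a file:// init method pointing at a path not used before."""
    # Hypothetical helper: a random suffix keeps each call from reusing a
    # file that an earlier, still-active process group may be waiting on.
    unique_name = f"{prefix}_{uuid.uuid4().hex}"
    return "file://" + os.path.join(tmpdir, unique_name)


with tempfile.TemporaryDirectory() as tmpdir:
    first = make_init_method(tmpdir)
    second = make_init_method(tmpdir)
    # Both paths live in the same directory but never collide.
    print(first != second)
```

The resulting string could then be passed as `init_method` to
`torch.distributed.init_process_group`, so no existing file has to be
removed while a process group may still be reading it.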
CC @mrwyattii , author of #3850 adding this code
---------
Signed-off-by: Alexander Grund <alexander.grund@tu-dresden.de>