Add local shutdown to process group agent (#30020)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30020
This is now possible due to previous changes made in `gloo` and `ProcessGroupGloo`. We `abort` the listener thread that is waiting for a message, and join all other threads. The destructor calls this same `localShutdown` method, but we ensure this is not called multiple times.
ghstack-source-id: 94415336
Test Plan: Unit tests pass.
Differential Revision: D5578006
fbshipit-source-id: 6258879fb44c9fca97fdfad64468c1488c16ac02