Fix race during RPC shutdown. (#36113)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36113
While debugging https://github.com/pytorch/pytorch/issues/35863,
I discovered that the unit test would time out during clean shutdown.
Looking into this further, there appears to be a race in
`_on_leader_follower_report_shutdown_intent` when multiple followers call the
same method on the leader concurrently.
To fix this, `_on_leader_follower_report_shutdown_intent` now holds an
appropriate lock, so concurrent calls from followers no longer race.
I ran the test 500 times to validate that this fix works.
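For readers unfamiliar with this kind of race, here is a minimal sketch of the
pattern described above. It is not the code from this change: `ShutdownLeader`,
`on_follower_report_shutdown_intent`, `wait_all_reported`, and
`simulate_followers` are hypothetical names, and threads stand in for followers
making RPC calls into the leader.

```python
import threading


class ShutdownLeader:
    """Hypothetical leader that tracks which workers have reported shutdown intent."""

    def __init__(self, worker_names):
        self._expected = set(worker_names)
        self._reported = set()
        # Guards _reported and the "everyone has reported" check below.
        self._lock = threading.Lock()
        self._all_reported = threading.Event()

    def on_follower_report_shutdown_intent(self, worker_name):
        # Without the lock, two followers could interleave the membership
        # check, the add, and the completeness check, so the event might
        # never be set and the waiter would hang until it times out.
        with self._lock:
            if worker_name not in self._expected:
                raise ValueError(f"unknown worker: {worker_name}")
            if worker_name in self._reported:
                raise ValueError(f"duplicate report from: {worker_name}")
            self._reported.add(worker_name)
            if self._reported == self._expected:
                self._all_reported.set()

    def wait_all_reported(self, timeout=None):
        # Returns True once every expected worker has reported.
        return self._all_reported.wait(timeout)


def simulate_followers(worker_names):
    """Have every follower report concurrently, then check the leader's state."""
    leader = ShutdownLeader(worker_names)
    threads = [
        threading.Thread(
            target=leader.on_follower_report_shutdown_intent, args=(name,)
        )
        for name in worker_names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return leader.wait_all_reported(timeout=1.0)
```

The design point is that the check and the update happen under one lock, so the
"all followers reported" condition is evaluated against a consistent view of
the state.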
Closes #35863
ghstack-source-id: 101641463
Test Plan:
1) waitforbuildbot
2) Ran the test 500 times.
Differential Revision: D20884373
fbshipit-source-id: 9d580e9892adffc0c9a4c2e832881fb291a1ff16