[rpc] various fixes for ProcessGroupAgent (#34943)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34943
Follow up to address Jeremy's and Shen's comments on
https://github.com/pytorch/pytorch/pull/34413:
1) Keep sending to the remaining workers even if one `agent->send()` fails
when cleaning up the distributed autograd context
2) Use RAII for lock in process group agent `handleSend`
3) Return bool instead of int in `ProcessGroupAgent::handleRecv` to determine
if the count should be incremented
4) Move the recvCounts increment in timed-out future processing into the
block that checks that the future does not already have an error.
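
A minimal sketch of the four patterns above, not the actual ProcessGroupAgent
code. Every name here (`Counters`, `FutureSketch`, `notifyAllWorkers`,
`recordSend`, `handleRecvSketch`, `processTimedOut`) is hypothetical and only
illustrates the shape of each change. The meaning of the bool returned from
the recv handler (false when the future was already removed by timeout
processing) is an assumption.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <set>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the agent's per-peer send/recv bookkeeping.
struct Counters {
  std::mutex mutex;
  std::vector<int> sendCounts;
  std::vector<int> recvCounts;
  explicit Counters(std::size_t worldSize)
      : sendCounts(worldSize, 0), recvCounts(worldSize, 0) {}
};

// Hypothetical future that only tracks whether an error was already set.
struct FutureSketch {
  bool hasError = false;
};

// (1) Try every worker; a failed send is counted but does not stop the loop.
std::size_t notifyAllWorkers(
    const std::vector<int>& workerIds,
    const std::function<void(int)>& send) {
  std::size_t failures = 0;
  for (int id : workerIds) {
    try {
      send(id);
    } catch (const std::exception&) {
      ++failures;  // keep going so the remaining workers are still notified
    }
  }
  return failures;
}

// (2) RAII: the guard releases the mutex on every exit path, including
// the exception thrown by at() for an out-of-range destination.
void recordSend(Counters& c, std::size_t dst) {
  std::lock_guard<std::mutex> guard(c.mutex);
  ++c.sendCounts.at(dst);
}

// (3) Return a bool saying whether the caller should increment the recv
// count. Assumption: a response whose id is no longer pending was already
// accounted for elsewhere, so it must not be counted again.
bool handleRecvSketch(std::set<long>& pendingIds, long responseId) {
  return pendingIds.erase(responseId) == 1;
}

// (4) Increment recvCounts only inside the branch that has confirmed the
// future does not already carry an error.
void processTimedOut(Counters& c, FutureSketch& fut, std::size_t src) {
  if (!fut.hasError) {
    fut.hasError = true;
    std::lock_guard<std::mutex> guard(c.mutex);
    ++c.recvCounts.at(src);
  }
}
```

Returning a bool rather than an int makes the "count or skip" decision
explicit at the call site, and keeping the increment inside the error check
avoids counting the same timed-out future twice.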
ghstack-source-id: 100681746
Test Plan: CI
Differential Revision: D20506065
fbshipit-source-id: 14a2820b3ae7a65edd103f0b333c4bc21e821235