Support work.result() to get result tensors for allreduce for Gloo, NCCL backends (#43970)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43970
This is a resubmission of #43386.
Original commit changeset: 27fbeb161706
ghstack-source-id: 111775070
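Usage sketch (not part of the diff): a minimal example of how the result tensors might be read back after an async allreduce. It assumes a single-process Gloo group so it can run without a launcher. The helper name allreduce_result_demo and the file-based init path are hypothetical, and the exact shape of the returned list is an assumption based on the title of this change.

```python
import os
import tempfile

import torch
import torch.distributed as dist


def allreduce_result_demo():
    """Run an async allreduce and read the output back via work.result()."""
    # Hypothetical setup: a one-rank Gloo group backed by a temporary file store.
    store_path = os.path.join(tempfile.mkdtemp(), "store")
    dist.init_process_group(
        "gloo", init_method=f"file://{store_path}", rank=0, world_size=1
    )
    try:
        tensor = torch.ones(4)
        # async_op=True returns a work handle instead of blocking.
        work = dist.all_reduce(tensor, op=dist.ReduceOp.SUM, async_op=True)
        work.wait()
        # Assumed here: result() returns a list holding the reduced tensor(s).
        results = work.result()
        return tensor, results
    finally:
        dist.destroy_process_group()


if __name__ == "__main__":
    tensor, results = allreduce_result_demo()
    print(results)
```

With the NCCL backend the same pattern would need CUDA tensors and one process per GPU.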
Test Plan:
Added checks to the existing unit test and ran it on a GPU devserver.
Verified the test that was failing in original diff also passes: https://app.circleci.com/pipelines/github/pytorch/pytorch/210229/workflows/86bde47b-f2da-48e3-a618-566ae2713102/jobs/7253683
Reviewed By: pritamdamania87
Differential Revision: D23455047
fbshipit-source-id: b8dc4a30b95570d68a482c19131674fff2a3bc7c