Directly use work.result() to retrieve tensor rather than passing as a separate argument (#44914)
Summary:
Currently, the C++ code fetches the allreduced tensor from Python and stores it in a struct's parameter, separately from the work object. This PR removes the extra tensor parameter from the function and fetches the tensor from a single place, work.result().
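As a rough illustration of the change in shape (not the actual diff), the sketch below contrasts passing the result tensor alongside the work handle with reading it from the handle. `Tensor`, `Work`, `finalizeBefore`, and `finalizeAfter` are hypothetical stand-ins invented for this example; only the idea of a `result()` accessor on the work object comes from the description above.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tensor type; not the real ATen type.
using Tensor = std::vector<float>;

// Hypothetical stand-in for a work handle that owns the collective's outputs.
struct Work {
  explicit Work(std::vector<Tensor> outputs) : outputs_(std::move(outputs)) {}

  // Returns the tensors produced by the collective.
  std::vector<Tensor> result() const { return outputs_; }

 private:
  std::vector<Tensor> outputs_;
};

// Before: the reduced tensor travels as a separate argument next to the
// work handle, so the same data is reachable from two places.
Tensor finalizeBefore(const std::shared_ptr<Work>& work, const Tensor& reduced) {
  (void)work;  // the handle is not needed to obtain the tensor here
  return reduced;
}

// After: the work handle is the single source of the reduced tensor.
// at(0) throws if the handle holds no outputs.
Tensor finalizeAfter(const std::shared_ptr<Work>& work) {
  return work->result().at(0);
}
```

With the second form, callers no longer have to keep a tensor argument in sync with the work object it came from.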
Fixes https://github.com/pytorch/pytorch/issues/43960
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44914
Reviewed By: rohan-varma
Differential Revision: D23798888
Pulled By: bugra
fbshipit-source-id: ad1b8c31c15e3758a57b17218bbb9dc1f61f1577