Replace all_gather with the more efficient collective API _all_gather_base (#57769)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57769
_all_gather_base gathers into a single output tensor and so avoids the extra copies that all_gather makes, which makes it more efficient.
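
A rough illustration of where the saved copies come from. This is a conceptual sketch only, not the PyTorch implementation: plain Python lists stand in for tensors, the helper names gather_into_list and gather_into_flat are hypothetical, and it assumes the list-based path stages results in a flat buffer before copying each rank's chunk out.

```python
# Conceptual sketch: lists stand in for tensors; "copies" counts the extra
# per-rank copies made after the collective has produced its result.
# Assumption: the list-based gather stages data in one flat buffer first.

def gather_into_list(rank_inputs):
    """all_gather-style: fill a flat staging buffer, then copy each chunk out."""
    flat = []
    for chunk in rank_inputs:          # the collective writes the flat buffer
        flat.extend(chunk)
    chunk_len = len(rank_inputs[0])
    outputs, copies = [], 0
    for rank in range(len(rank_inputs)):
        start = rank * chunk_len
        outputs.append(flat[start:start + chunk_len])  # extra copy per rank
        copies += 1
    return outputs, copies

def gather_into_flat(rank_inputs):
    """_all_gather_base-style: the flat buffer is itself the output."""
    flat = []
    for chunk in rank_inputs:
        flat.extend(chunk)
    return flat, 0                     # no per-rank copy-out step

if __name__ == "__main__":
    inputs = [[1, 2], [3, 4], [5, 6]]  # one equal-sized chunk per rank
    print(gather_into_list(inputs))
    print(gather_into_flat(inputs))
```

In this model the list-based variant pays one additional copy per rank, while the flat variant pays none; callers that need per-rank views can slice the flat output instead.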
Test Plan: unit test
Reviewed By: SciPioneer
Differential Revision: D28227193
fbshipit-source-id: ddd8590095a5b45676497a71ed792a457f9825c6