[BE] rewrite ProcessGroupNCCLTest to be MultiProcess (#67705)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67705
This PR rewrites ProcessGroupNCCLTest as a MultiProcessTestCase. The test was originally written in a single-process, multi-GPU fashion; it now runs as multiple processes to align with the other c10d tests.
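
For readers unfamiliar with the multi-process layout, here is a rough stdlib-only sketch of the general idea: the parent launches one process per rank and treats the run as passing only if every rank exits with 0. This is not the code in this PR and not how MultiProcessTestCase is implemented; `run_ranks`, `RANK_BODY`, and the `RANK`/`WORLD_SIZE` environment variables are hypothetical names chosen for illustration.

```python
import os
import subprocess
import sys

# Hypothetical per-rank test body (an assumption for illustration).
# A real NCCL test would create a process group and run collectives here.
RANK_BODY = """
import os
rank = int(os.environ["RANK"])
world_size = int(os.environ["WORLD_SIZE"])
assert 0 <= rank < world_size
"""


def run_ranks(world_size, body=RANK_BODY):
    """Launch one Python process per rank and return the exit codes in rank order."""
    procs = []
    for rank in range(world_size):
        # Each child learns its identity from its environment.
        env = dict(os.environ, RANK=str(rank), WORLD_SIZE=str(world_size))
        procs.append(subprocess.Popen([sys.executable, "-c", body], env=env))
    # Wait for every rank; a nonzero code means that rank failed.
    return [p.wait() for p in procs]


if __name__ == "__main__":
    codes = run_ranks(world_size=2)
    print("all ranks passed" if all(c == 0 for c in codes) else f"failures: {codes}")
```

Because each rank is its own process, a crash or failed assertion in one rank surfaces as a nonzero exit code for that rank rather than taking down the whole test run.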
ghstack-source-id: 144555092
Test Plan: wait for CI
Reviewed By: pritamdamania87, fduwjj
Differential Revision: D32113626
fbshipit-source-id: 613d36aeae36bf441de1c2c83aa4755f4d33df4d