[Dist Profiling] Enable dist profiling for DDP (gloo only) (#52031)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52031
Closes https://github.com/pytorch/pytorch/issues/52020
Ensures that collectives in DDP can be profiled by propagating the profiler threadLocalState appropriately. As described in the issue above, this did not work previously because the profiler was only enabled on the main thread.
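
For illustration only, here is a minimal pure-Python analogy of the problem and the fix. It is not the PyTorch implementation: `_state`, `profiler_enabled`, and `run_on_worker` are hypothetical names, and `threading.local` stands in for the profiler's thread-local state. A flag set on the main thread is invisible on a worker thread unless it is captured by the caller and restored on the worker.

```python
import threading

# Hypothetical stand-in for the profiler's thread-local state.
_state = threading.local()


def profiler_enabled():
    # Threads that never set the flag see the default (False).
    return getattr(_state, "enabled", False)


def run_on_worker(fn, propagate):
    """Run fn on a new thread, optionally copying the caller's state to it."""
    captured = profiler_enabled()  # read on the calling thread
    result = []

    def target():
        if propagate:
            # Restore the caller's state on the worker before doing work.
            _state.enabled = captured
        result.append(fn())

    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    return result[0]


_state.enabled = True  # "enable the profiler" on the calling thread
without_propagation = run_on_worker(profiler_enabled, propagate=False)
with_propagation = run_on_worker(profiler_enabled, propagate=True)
```

Without propagation the worker sees the flag as unset, which mirrors the behavior described in the issue; with propagation the worker observes the caller's state.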
ghstack-source-id: 121818080
Test Plan: CI
Reviewed By: zhaojuanmao
Differential Revision: D26356192
fbshipit-source-id: 0158b5833a3f857a0b4b2943ae3037e9d998dfd1