Fix DDP bug in single-process multiple-device use cases (#36503)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36503
Test Plan: Imported from OSS
Differential Revision: D21179274
Pulled By: mrshenli
fbshipit-source-id: 0afce30ae0ddda753d1e240584a0f80df9aec4c2