[DataLoader] Make distributed lazily initialized & share seed via PG (#85279)
Fixes #84492 https://github.com/pytorch/data/issues/772
## Changes
- Move the logic of distributed sharding from the constructor of DataLoader to the constructor of DataLoaderIterator. This would prevent the Error caused by lazy distributed process initialization
- Replace distributed store by process group (`gloo`) to share the random seed because `mpi` backend doesn't provide distributed store.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85279
Approved by: https://github.com/NivekT, https://github.com/VitalyFedyunin