[DTensor] Only wait on AsyncCollectiveTensor after DTensor-based state dict loading (#119716)
Summary:
This PR is a follow-up fix for the numerical correctness concerns identified in PR #118197: after DTensor-based state dict loading, we should call `wait()` only on tensors that are `AsyncCollectiveTensor` instances.
Without this change, we occasionally ran into the exception `AttributeError("'Tensor' object has no attribute 'wait'")`, because `wait()` was also being called on plain tensors.
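The guard described above can be sketched without torch as follows. This is a minimal illustration, not the code in this PR: `PlainTensor`, `AsyncTensor`, and `wait_if_async` are hypothetical stand-ins for a regular `Tensor`, for `AsyncCollectiveTensor`, and for the type check applied after loading.

```python
# Hypothetical stand-in classes, for illustration only; they mimic the
# relevant behavior so the sketch runs without torch installed.
class PlainTensor:
    """Stands in for a regular tensor: it has no wait() method."""

    def __init__(self, data):
        self.data = data


class AsyncTensor(PlainTensor):
    """Stands in for AsyncCollectiveTensor: wait() completes a pending collective."""

    def __init__(self, data):
        super().__init__(data)
        self.completed = False

    def wait(self):
        # Mark the (simulated) collective as finished and hand back a plain tensor.
        self.completed = True
        return PlainTensor(self.data)


def wait_if_async(t):
    """Call wait() only on async tensors; return anything else unchanged."""
    if isinstance(t, AsyncTensor):
        return t.wait()
    return t


plain = PlainTensor([1, 2])
pending = AsyncTensor([3, 4])

same = wait_if_async(plain)    # no wait() call, so no AttributeError
done = wait_if_async(pending)  # wait() is called exactly where it exists
```

Calling `plain.wait()` unconditionally would raise the same kind of `AttributeError` quoted above; the `isinstance` check avoids it while still waiting on tensors that have a pending collective.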
Test Plan:
**CI**:
Wait for the CI tests to pass.
**Test with prod model**:
- Tested with prod models; the exception no longer occurs after checkpoint loading.
Differential Revision: D53680406
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119716
Approved by: https://github.com/fegin, https://github.com/Skylion007, https://github.com/wz337