[FSDP2] Added 2D DCP save/load test (#121747)
To prepare for FSDP2 + TP/SP (2D parallelism) in torchtrain, this PR adds a unit test verifying that training resumes correctly after a DCP (distributed checkpoint) save/load. To load into a new model/optimizer instance, torchtrain uses lightweight `ModelWrapper` and `OptimizerWrapper` classes. The added unit test instead calls `get_optimizer_state_dict` directly to show the minimal requirement for correctness.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121747
Approved by: https://github.com/wz337