[DCP][DSD] Add AdamW to distributed state dict unit tests (#121774)
Thanks @fegin for removing the FSDP root module check in DCP to unblock test updates. https://github.com/pytorch/pytorch/pull/121544
This PR adds "optimizer_class" as a kwarg to the subtests of the following tests, so that AdamW is covered as an optimizer option:
- test_fsdp
- test_compiled_fsdp
- test_fsdp2
- test_ddp
- test_fsdp_ddp
- test_cpu_offload_full_state_dict
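
For illustration, the sketch below shows how adding an optimizer-class kwarg to a subtest config makes each listed test run once per optimizer. It is a minimal, pure-Python approximation and not the code from this PR: `run_subtests` here is a simplified stand-in for the test harness helper, `Adam` and `AdamW` are placeholder classes standing in for `torch.optim.Adam` and `torch.optim.AdamW`, and the `use_composable` kwarg and the body of `_test_ddp` are assumptions.

```python
import itertools
from typing import Any, Callable, Dict, List


class Adam:
    """Placeholder standing in for torch.optim.Adam (assumption)."""


class AdamW:
    """Placeholder standing in for torch.optim.AdamW (assumption)."""


def run_subtests(
    subtest_config: Dict[str, List[Any]], test_fn: Callable[..., Any]
) -> List[Any]:
    """Call test_fn once per combination of the configured kwarg values."""
    keys = list(subtest_config)
    results = []
    for values in itertools.product(*(subtest_config[k] for k in keys)):
        results.append(test_fn(**dict(zip(keys, values))))
    return results


def _test_ddp(use_composable: bool, optimizer_class: type) -> str:
    # A real test would build the model, construct optimizer_class(...),
    # and compare state dicts; here we only record which case ran.
    return f"{optimizer_class.__name__}(composable={use_composable})"


ran = run_subtests(
    {
        "use_composable": [True, False],
        # The new kwarg: every existing combination now runs per optimizer.
        "optimizer_class": [Adam, AdamW],
    },
    _test_ddp,
)
print(ran)
```

Each added value in the optimizer list multiplies the number of subtest cases, so the existing combinations are all re-run with AdamW.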
In addition, we temporarily remove the two _verify_osd_by_load calls in _test_save_load, as state dict loading seems to affect parameters. Issue https://github.com/pytorch/pytorch/issues/121186 was created to keep track of this.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121774
Approved by: https://github.com/Skylion007
ghstack dependencies: #121773