[FSDP2][DCP][DSD] Add test to ensure FSDP2 model/optim state_dict work after a full training loop (#120871)
This PR adds tests verifying that distributed state dict (DSD) works properly with FSDP2's model and optimizer state_dict after a full training loop.
We test the combinations of these options on an evenly sharded model:
```
{
    "reshard_after_forward": [True, False],
    "optimizer_class": [torch.optim.Adam],
    "compile_model": [True, False],
}
```
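For illustration, here is a minimal sketch of how one such combination might be exercised, assuming a default process group is already initialized. The `_train_and_get_state_dicts` helper, the placeholder model, and the loop length are illustrative, not the actual test code; module paths follow the FSDP2 prototype at the time of this PR.
```
import torch
from torch.distributed._composable.fsdp import fully_shard
from torch.distributed.checkpoint.state_dict import get_state_dict

def _train_and_get_state_dicts(reshard_after_forward, optimizer_class, compile_model):
    # Hypothetical helper, not from the PR. Assumes torch.distributed
    # is already initialized (e.g. via init_process_group).
    model = torch.nn.Linear(8, 8)  # placeholder evenly sharded model
    fully_shard(model, reshard_after_forward=reshard_after_forward)
    if compile_model:
        model = torch.compile(model)
    optim = optimizer_class(model.parameters(), lr=1e-2)
    for _ in range(3):  # full training loop: forward, backward, step
        loss = model(torch.rand(4, 8)).sum()
        loss.backward()
        optim.step()
        optim.zero_grad()
    # DSD returns unified model/optimizer state dicts that should be
    # consistent regardless of the sharding/compile configuration above.
    model_sd, optim_sd = get_state_dict(model, optim)
    return model_sd, optim_sd
```
Running the check after a full training loop (rather than at initialization) matters because Adam's optimizer state is only materialized on the first `optim.step()`, so DSD must correctly handle the sharded optimizer states that exist at that point.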
Follow-ups:
1. Add a test for an unevenly sharded model.
2. Add a test that includes `torch.optim.AdamW` (it seems to have some gaps currently; still investigating).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120871
Approved by: https://github.com/fegin