a6895022 - [FSDP] Do not include empty state in `_flatten_optim_state_dict()` (#88353)

https://github.com/pytorch/pytorch/blob/983c0e7f3101f1543bed6c4ec1539a4d590a94c0/torch/optim/adam.py#L163

The above line requires that a candidate optimizer state dict being loaded via `load_state_dict()` have non-empty state for its 0th parameter, since it indexes `state_values[0]`. This PR changes FSDP to include only non-empty mappings in the state returned by `_flatten_optim_state_dict()`, which is the subroutine underlying both `shard_full_optim_state_dict()` and `flatten_sharded_optim_state_dict()`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88353
Approved by: https://github.com/fegin
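For context, the sketch below illustrates both the failure mode and the shape of the fix. It is not the actual FSDP code: the helper name `flatten_optim_state_dict` and the sample state dict are illustrative stand-ins for the real `_flatten_optim_state_dict()` subroutine, and the Adam logic quoted in the comments is paraphrased from the referenced line.

```python
import torch

# The referenced line in torch/optim/adam.py does roughly the
# following when a state dict is loaded (paraphrased):
#
#     state_values = list(self.state.values())
#     step_is_tensor = (len(state_values) != 0) and torch.is_tensor(
#         state_values[0]["step"]
#     )
#
# If the 0th parameter's state is an empty dict, indexing it with
# ["step"] raises a KeyError.

def flatten_optim_state_dict(unflat_state):
    """Sketch of the fix: drop empty per-parameter state mappings so
    that the 0th entry, if any exists, is guaranteed to be non-empty.
    (Illustrative stand-in for FSDP's `_flatten_optim_state_dict()`.)
    """
    return {
        param_id: param_state
        for param_id, param_state in unflat_state.items()
        if len(param_state) > 0  # keep only non-empty mappings
    }

# Hypothetical example: parameter 0 has no optimizer state (e.g., it
# never received a gradient), while parameter 1 does.
unflat = {
    0: {},
    1: {"step": torch.tensor(10.0), "exp_avg": torch.zeros(4)},
}
flat = flatten_optim_state_dict(unflat)
assert 0 not in flat and 1 in flat
```

With empty mappings filtered out, Adam's probe of `state_values[0]["step"]` either sees a genuinely populated entry or an empty `state_values` list, and its existing `len(state_values) != 0` guard already handles the latter.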