[FSDP][optim_state_dict] Copy step tensor so that each parameter has its own step (#96313)
Summary: When parameters are flattened, multiple parameters share the same step tensor. When the parameters are unflattened, the current implementation still makes these parameters share the same step. While this is not incorrect, some training infrastructure gets confused by the shared tensor storage. This PR fixes the issue by copying the step tensor so that each parameter has its own.
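A minimal sketch (not the actual FSDP code) of the idea: when expanding a flat parameter's optimizer state back into per-parameter states, clone the shared scalar "step" tensor so each parameter owns its own storage. The helper name `unflatten_optim_state` and the state layout below are illustrative assumptions.

```python
import torch

def unflatten_optim_state(flat_state, param_names):
    """Hypothetical helper: map a flat parameter's optimizer state to
    per-parameter states. `flat_state` and `param_names` are illustrative."""
    per_param_state = {}
    for name in param_names:
        state = {}
        for key, value in flat_state.items():
            if key == "step" and torch.is_tensor(value):
                # Before this fix (conceptually), every parameter aliased the
                # same step tensor. Cloning gives each parameter its own copy,
                # so no two parameters share tensor storage.
                state[key] = value.clone()
            else:
                state[key] = value  # other state (e.g. exp_avg) handled elsewhere
        per_param_state[name] = state
    return per_param_state

# Illustrative usage: after the copy, the two steps no longer share storage.
flat_state = {"step": torch.tensor(10.0), "exp_avg": torch.zeros(4)}
states = unflatten_optim_state(flat_state, ["weight", "bias"])
assert states["weight"]["step"].data_ptr() != states["bias"]["step"].data_ptr()
```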
Test Plan: CI
Reviewed By: awgu
Differential Revision: D43893592
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96313
Approved by: https://github.com/zhaojuanmao