15e58c19 - [FSDP][optim_state_dict] Copy step tensor so that each parameter has its own step (#96313)

Summary: When parameters are flattened, multiple parameters share the same `step` tensor. When unflattening the parameters, the current implementation still makes these parameters share the same `step`. While this is not incorrect, some training infrastructure gets confused by the shared tensor storage. This PR fixes the issue by copying the `step` tensor so that each parameter has its own.

Test Plan: CI

Reviewed By: awgu

Differential Revision: D43893592

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96313

Approved by: https://github.com/zhaojuanmao
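A minimal sketch of the behavior change, using a hypothetical `unflatten_step` helper rather than the actual FSDP internals: when a flat parameter's optimizer state is expanded back into per-parameter entries, cloning the shared `step` tensor gives each parameter its own storage.

```python
import torch

def unflatten_step(flat_state: dict, param_names: list) -> dict:
    """Hypothetical sketch: expand one flat parameter's optimizer state
    into per-parameter entries. ``flat_state`` holds the flat parameter's
    state, e.g. {"step": tensor(10.), ...}."""
    per_param_state = {}
    for name in param_names:
        per_param_state[name] = {
            # Before this PR (conceptually), the same tensor object was
            # reused, so every parameter aliased one storage:
            #   "step": flat_state["step"],
            # After: clone so each parameter owns its own step tensor.
            "step": flat_state["step"].clone(),
        }
    return per_param_state

state = unflatten_step({"step": torch.tensor(10.0)}, ["weight", "bias"])
# The values match, but the storages are now distinct.
assert state["weight"]["step"].data_ptr() != state["bias"]["step"].data_ptr()
```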