[FSDP][optim_state_dict] Copy step tensor so that each parameter has its own step (#96313)
Summary: When parameters are flattened, multiple parameters share the same step tensor. When the parameters are unflattened, the current implementation still makes these parameters share the same step. While this is not incorrect, some training infrastructure gets confused by the shared tensor storage. This PR fixes the issue by copying the step tensor so that each parameter has its own.
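A minimal sketch (not the actual FSDP code) of the idea: when expanding a flat parameter's optimizer state back into per-parameter states, clone the shared scalar "step" tensor so each parameter owns its own storage. The helper name `unflatten_optim_state` and the state layout below are illustrative assumptions.

```python
import torch

def unflatten_optim_state(flat_state, param_names):
    """Hypothetical helper: map a flat parameter's optimizer state to
    per-parameter states. `flat_state` and `param_names` are illustrative."""
    per_param_state = {}
    for name in param_names:
        state = {}
        for key, value in flat_state.items():
            if key == "step" and torch.is_tensor(value):
                # Before this fix (conceptually), every parameter aliased the
                # same step tensor. Cloning gives each parameter its own copy,
                # so no two parameters share tensor storage.
                state[key] = value.clone()
            else:
                state[key] = value  # other state (e.g. exp_avg) handled elsewhere
        per_param_state[name] = state
    return per_param_state

# Illustrative usage: after the copy, the two steps no longer share storage.
flat_state = {"step": torch.tensor(10.0), "exp_avg": torch.zeros(4)}
states = unflatten_optim_state(flat_state, ["weight", "bias"])
assert states["weight"]["step"].data_ptr() != states["bias"]["step"].data_ptr()
```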
Test Plan: CI
Reviewed By: awgu
Differential Revision: D43893592
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96313
Approved by: https://github.com/zhaojuanmao