a916d649 - [FSDP] Relax `sharded_grad` assert to allow IDLE (#96584)

`_use_sharded_grad_views()` can be called when re-registering the original parameters in `load_state_dict()`, in which case the training state is `IDLE`. Previously, I only expected `_use_sharded_grad_views()` to be called in `FORWARD` when the sharded gradient is not in `_saved_grad_shard` or `_cpu_grad`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96584
Approved by: https://github.com/fegin, https://github.com/zhaojuanmao
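A minimal sketch of the relaxation described above. The names `TrainingState`, `_saved_grad_shard`, and `_cpu_grad` come from the commit message, but the enum values and the `FlatParamHandle` structure here are simplified assumptions for illustration, not PyTorch's actual FSDP internals:

```python
from enum import Enum, auto


# Hypothetical stand-in for FSDP's training-state enum (assumption).
class TrainingState(Enum):
    IDLE = auto()
    FORWARD = auto()
    BACKWARD = auto()


# Hypothetical stand-in for the handle that owns the sharded gradient.
class FlatParamHandle:
    def __init__(self, state: TrainingState):
        self.training_state = state
        self._saved_grad_shard = None
        self._cpu_grad = None

    def sharded_grad(self):
        # Prefer the gradient wherever it is actually stored.
        if self._saved_grad_shard is not None:
            return self._saved_grad_shard
        if self._cpu_grad is not None:
            return self._cpu_grad
        # Before the fix, only FORWARD was permitted here; the fix also
        # allows IDLE, since `load_state_dict()` re-registers the original
        # parameters while the training state is IDLE.
        assert self.training_state in (
            TrainingState.FORWARD,
            TrainingState.IDLE,  # relaxed by this commit
        ), f"unexpected training state: {self.training_state}"
        return None  # no sharded gradient available yet


# IDLE no longer trips the assert:
handle = FlatParamHandle(TrainingState.IDLE)
print(handle.sharded_grad())  # → None
```

With the old, stricter assert, the `IDLE` call above would have raised an `AssertionError` during `load_state_dict()`; states outside the allowed set (e.g. `BACKWARD` in this sketch) still fail fast.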