Enhance `no_grad`-context FSDP backward handling (#105374)
Fixes #105369
Fixes #105371
This PR addresses two somewhat distinct issues that involve the same test:
1. To fix #105369:
- Add a `no_grad` guard to [`_register_post_backward_reshard_only_hooks`](https://github.com/pytorch/pytorch/blob/93f852f201b93ca0c41b5cd861834d4f1f235ef7/torch/distributed/fsdp/_runtime_utils.py#L1406) to avoid registering post-backward hooks that would not be removed in that context.
2. To fix #105371:
- Add a grad-context condition to the [`_use_sharded_flat_param`](https://github.com/pytorch/pytorch/blob/93f852f201b93ca0c41b5cd861834d4f1f235ef7/torch/distributed/fsdp/flat_param.py#L1645C9-L1645C32) logic so that post-forward `_use_sharded_views` is triggered in a `no_grad` context for `NO_RESHARD_AFTER_FORWARD_HANDLE_STRATEGIES`.
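The core idea behind fix 1 can be illustrated with a minimal, hypothetical sketch (the function name and logic here are illustrative only, not the actual FSDP internals): under `no_grad`, no autograd graph is built, so a post-backward hook would never fire and therefore never be cleaned up, and the guard simply skips registration.

```python
import torch

def maybe_register_post_backward_hook(tensor: torch.Tensor, hook) -> bool:
    # Illustrative guard mirroring the spirit of the `no_grad` check added
    # in `_register_post_backward_reshard_only_hooks` (not the real code):
    # without grad enabled, backward never runs, so the hook would dangle.
    if not torch.is_grad_enabled() or not tensor.requires_grad:
        return False  # skip registration; nothing will ever invoke the hook
    tensor.register_hook(hook)  # fires when the grad w.r.t. `tensor` is computed
    return True

t = torch.ones(2, requires_grad=True)
with torch.no_grad():
    registered = maybe_register_post_backward_hook(t, lambda g: g)
print(registered)  # False: the guard skipped registration under no_grad
```

Outside a `no_grad` context the same call returns `True` and the hook is attached normally, which matches the intended asymmetry of the fix.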
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105374
Approved by: https://github.com/awgu