[FSDP] Avoided CPU sync in `clip_grad_norm_` (#122001)
Copying a scalar 0 tensor from CPU to GPU, or constructing a scalar 0 tensor on the GPU, requires a CPU sync with the GPU, so we avoid ops that involve such a tensor.
`FSDP.clip_grad_norm_` already checks first whether all parameters are unsharded and, if so, calls into `nn.utils.clip_grad_norm_`, so at the point of the code changes there are guaranteed to be some sharded parameters.
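As a rough illustration of the pattern being avoided (this is a minimal sketch, not the actual FSDP code; the helper name and structure are assumptions), the idea is to build the local gradient norm only from tensors that already live on the device, so no scalar 0 tensor has to be copied to or constructed on the GPU:

```python
import torch

def _local_sharded_grad_norm(grads, norm_type: float) -> torch.Tensor:
    # Pattern to avoid (can trigger a CPU/GPU sync):
    #     total = torch.tensor(0.0, device=grads[0].device)  # scalar 0 materialized on GPU
    #     for g in grads:
    #         total = total + torch.linalg.vector_norm(g, norm_type) ** norm_type
    #
    # Preferred: compute per-gradient norms and combine them directly, using only
    # tensors already on the device. Assumes `grads` is non-empty, which matches
    # the guarantee above that some sharded parameters exist at this point.
    norms = [torch.linalg.vector_norm(g, norm_type) for g in grads]
    return torch.linalg.vector_norm(torch.stack(norms), norm_type)
```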
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122001
Approved by: https://github.com/wanchaol