SemanticDiff

pytorch
3956ce01 - [FSDP2] Added autograd/memory/overlap/frozen/2D/AC tests (#118136)


Commit

222 days ago

[FSDP2] Added autograd/memory/overlap/frozen/2D/AC tests (#118136)

This PR adds tests for autograd (mainly backward hooks), memory, overlap, and frozen parameters.
- Autograd: unused forward output, unused forward module, non-tensor activations (common in internal models)
- Memory: expected GPU memory usage after init, forward, backward, and optimizer step
- Overlap: communication/computation overlap in forward and backward
- Frozen: expected reduce-scatter size, training parity

This PR adds some initial 2D (FSDP + TP) training and model state dict tests. The only change required for model sharded state dict is to make sure parameters are sharded before save and load.

This PR adds tests that `fully_shard` can use `torch.utils.checkpoint`, `_composable.checkpoint`, and `CheckpointWrapper` on a transformer.

(I squashed all of these into one PR now to save CI cost.)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118136
Approved by: https://github.com/weifengpy, https://github.com/wanchaol
ghstack dependencies: #119550

Author

awgu

Committer

pytorchmergebot
