support pure fp16 training in FSDP (#68417)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68417
1. Since parameter attributes are lazily initialized at the beginning of forward, it makes more sense to initialize `full_param_padded` with the parameters' data type at `lazy_init` time rather than at construction time, because the parameters' data type may change after construction and before the training loop (see the first sketch below).
2. Add a check for whether parameter storage has been changed outside FSDP, and handle that case properly (see the second sketch below).
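
For (1), a minimal sketch of the scenario this change enables. The single-GPU NCCL setup, env vars, and import path are assumptions for illustration, not part of this PR:

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("nccl", rank=0, world_size=1)

model = FSDP(nn.Linear(8, 8).cuda())
# Cast parameters to fp16 *after* FSDP construction but *before* the first
# forward. Because full_param_padded is created during lazy_init (at the
# start of the first forward), it now picks up the fp16 dtype rather than
# the fp32 dtype the parameters had at construction time.
model = model.half()
out = model(torch.randn(4, 8, device="cuda", dtype=torch.float16))
out.sum().backward()
```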
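
For (2), an illustrative sketch of the kind of check described; the helper name and the recorded-pointer bookkeeping are hypothetical, not the actual FSDP implementation:

```python
import torch

def _storage_changed_outside_fsdp(param: torch.nn.Parameter,
                                  recorded_ptr: int) -> bool:
    # Hypothetical check: if user code (e.g., model.half()) swapped out the
    # parameter's underlying storage after FSDP recorded recorded_ptr, the
    # data pointer no longer matches, and FSDP must rebuild its padded
    # full-parameter buffer instead of reusing a stale one.
    return param.data_ptr() != recorded_ptr
```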
ghstack-source-id: 144479019
Test Plan: unit tests
Reviewed By: rohan-varma
Differential Revision: D32458643
fbshipit-source-id: 0e07e5e08270f2e265e8f49124a6648641e42e7a