[ZeroRedundancyOptimizer] Elastic and pytorch compatible checkpoints (#50956)
Summary:
- Makes it possible to load non-sharded (plain PyTorch) optimizer checkpoints, as long as the model and param groups match
- Makes it possible to save a checkpoint with one world size and load it with another (see the sketch below)
- Uses the Torch Distributed built-in `broadcast_object_list` instead of an ad-hoc version
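A minimal sketch of the round-trip this change enables, assuming a process group is already initialized (e.g. via `torchrun`) and using the `torch.distributed.optim.ZeroRedundancyOptimizer` API; the `save_checkpoint`/`load_checkpoint` helpers are illustrative, not part of this commit:

```python
import torch
import torch.distributed as dist
from torch.distributed.optim import ZeroRedundancyOptimizer

def save_checkpoint(model, optimizer, path):
    # Gather every rank's shard onto rank 0; the consolidated result has
    # the same layout as a plain torch.optim state dict.
    optimizer.consolidate_state_dict(to=0)
    if dist.get_rank() == 0:
        torch.save({"model": model.state_dict(),
                    "optim": optimizer.state_dict()}, path)
    dist.barrier()

def load_checkpoint(model, optimizer, path):
    # The checkpoint may come from a run with a different world size, or
    # from a non-sharded optimizer, as long as the param groups match.
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optim"])

# Assumes dist.init_process_group() has already been called.
model = torch.nn.Linear(8, 8)
optimizer = ZeroRedundancyOptimizer(
    model.parameters(), optimizer_class=torch.optim.SGD, lr=0.1)
save_checkpoint(model, optimizer, "zero_ckpt.pt")
load_checkpoint(model, optimizer, "zero_ckpt.pt")
```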
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50956
Reviewed By: malfet
Differential Revision: D26113953
Pulled By: blefaudeux
fbshipit-source-id: 030bfeee2c34c2d987590d45dc8efe05515f2e5c