[FSDP/Checkpoint] Activation offload support in checkpoint_wrapper (#70165)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70165
Implements activation offload support in checkpoint_wrapper API via
save_on_cpu hooks. We avoid modifying the torch.utils.checkpoint implementation
and instead compose offload + checkpoint using the save_on_cpu hook for the
former.
ghstack-source-id: 146078900
Test Plan: CI
Reviewed By: zhaojuanmao
Differential Revision: D33228820
fbshipit-source-id: 98b4da0828462c41c381689ee07360ad014e808a