DeepSpeed
b1cb0dfc - Guanhua/partial offload rebase v2 (#590) (#4636)

Commit
1 year ago
Guanhua/partial offload rebase v2 (#590) (#4636)

This PR introduces the Twin-Flow feature of ZeRO-Offload++, which improves end-to-end training iteration time by up to 6x on DGX-H100s.

This PR includes:
* Twin-Flow implementation inside the ZeRO optimizer
* JSON config tutorial
* example using DeepSpeed
* unit tests

cc @jeffra @awan-10 @tjruwase @mrwyattii

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
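
For readers who want a feel for the JSON config mentioned above, here is a minimal sketch of what a partial-offload setup might look like. It assumes that the offload fraction is set through a `ratio` field under `zero_optimization.offload_optimizer` and that ZeRO stage 3 is in use; the helper name `make_twin_flow_config`, the batch size, and the 0.3 value are illustrative only and are not taken from this commit. Check the tutorial added by the PR for the authoritative keys.

```python
import json


def make_twin_flow_config(offload_ratio: float) -> dict:
    """Build a DeepSpeed-style config dict with partial optimizer offload.

    Hypothetical helper for illustration; the key names are assumptions
    and should be checked against the config tutorial in this PR.
    """
    if not 0.0 <= offload_ratio <= 1.0:
        raise ValueError("offload_ratio must be between 0.0 and 1.0")
    return {
        "train_micro_batch_size_per_gpu": 1,  # illustrative value
        "zero_optimization": {
            "stage": 3,  # assumption: partial offload is used with ZeRO-3
            "offload_optimizer": {
                "device": "cpu",
                "pin_memory": True,
                # Assumed meaning: fraction of optimizer state offloaded to
                # CPU, with the remainder kept on the GPU.
                "ratio": offload_ratio,
            },
        },
    }


config = make_twin_flow_config(0.3)
config_json = json.dumps(config, indent=2)
print(config_json)
```

The printed JSON could be saved to a file and passed to the `deepspeed` launcher through its usual config argument; a ratio of 1.0 would correspond to offloading all optimizer state to the CPU under the assumption stated above.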