DeepSpeed
Fix ZeRO-1/2 CPU-offloaded gradient loss with multiple backward() per step
#7981
Merged

roycho96 Fix ZeRO-1/2 CPU-offloaded gradient loss with multiple backward() per…
70e4e69a
roycho96 requested a review from tjruwase 52 days ago
roycho96 requested a review from tohtana 52 days ago
roycho96 requested a review from loadams 52 days ago
delock approved these changes on 2026-04-21
roycho96 fix formatting
efd10ee3
roycho96 force pushed from 95d73e28 to efd10ee3 51 days ago
delock merged aeb10bb1 into master 50 days ago
roycho96 deleted the fix/zero2-offload-ga1-multi-backward branch 50 days ago
