DeepSpeed
17c8be07 - Fix the GPU memory usage of ZeRO-Offload (only update stage_1_and_2.py) (#7309)

Commit
119 days ago
Fix the GPU memory usage of ZeRO-Offload (only update stage_1_and_2.py) (#7309)

Signed-off-by: Armin Zhu <mingzhengzhu1998@gmail.com>

Fix the GPU memory usage of ZeRO-Offload with stages 1 and 2. Before the fix, GPU memory usage was about 3x the size of params_FP16. This was caused by the H2D (host-to-device) data copy using a different data type. GPU memory usage is now about 1x params_FP16, and the H2D memory copy needs a 16-bit pinned memory buffer.
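The 3x and 1x figures can be illustrated with a rough byte count. The sketch below is not DeepSpeed code. It assumes (the commit message does not spell this out) that the extra memory came from staging a 32-bit copy of the parameters on the GPU before casting to FP16, and that the fix casts on the host into a 16-bit pinned buffer so the copy matches the destination type. The helper names `peak_gpu_bytes` and `ratio_to_fp16` are hypothetical.

```python
# Rough accounting sketch; the staging-buffer breakdown is an assumption,
# not taken from the DeepSpeed source.
FP16_BYTES = 2  # bytes per 16-bit element
FP32_BYTES = 4  # bytes per 32-bit element


def peak_gpu_bytes(num_params: int, copy_dtype_bytes: int) -> int:
    """Estimate peak GPU bytes while refreshing FP16 params from the host."""
    params_fp16 = num_params * FP16_BYTES
    # Assumed: a GPU-side staging tensor is needed only when the copied
    # data type differs from the FP16 destination.
    if copy_dtype_bytes != FP16_BYTES:
        staging = num_params * copy_dtype_bytes
    else:
        staging = 0
    return params_fp16 + staging


def ratio_to_fp16(num_params: int, copy_dtype_bytes: int) -> float:
    """Express the peak as a multiple of the FP16 parameter size."""
    return peak_gpu_bytes(num_params, copy_dtype_bytes) / (num_params * FP16_BYTES)


if __name__ == "__main__":
    n = 1_000_000_000  # hypothetical model size
    print("copy as FP32 (before):", ratio_to_fp16(n, FP32_BYTES))
    print("copy as FP16 (after): ", ratio_to_fp16(n, FP16_BYTES))
```

Under these assumptions, a 32-bit copy adds 2x the FP16 size on top of the parameters themselves, which is consistent with the reported drop from about 3x to about 1x.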