DeepSpeed
18489835 - fix gradient accumulation for z2+offload

Commit
1 year ago
fix gradient accumulation for z2+offload
Author
Masahiro Tanaka