DeepSpeed
3c0bd312 - BF16 optimizer: Improve device utilization by immediate grad update (#4975)

Commit
1 year ago
BF16 optimizer: Improve device utilization by immediate grad update (#4975)

Enabled gradient accumulation in the bf16 optimizer, which updates fp32 gradients as soon as they are available. This improves device utilization on some back-ends by parallelizing the workload across engines.

To enable the feature, use the new config flag "immediate_grad_update" under the "bf16" section of the DeepSpeed config.json. The flag defaults to false, so the feature is disabled unless set.

Example:

    "bf16": {
        "enabled": true,
        "immediate_grad_update": true
    }

---------

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
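
The flag described in the commit message can also be set when the config is built as a Python dict instead of being edited by hand. The following is a minimal sketch, not code from this commit: the helper name `with_immediate_grad_update` and the placeholder base config are assumptions for illustration; only the "bf16" section and its "enabled" and "immediate_grad_update" keys come from the commit message.

```python
import copy
import json


def with_immediate_grad_update(config, enabled=True):
    """Return a copy of a DeepSpeed-style config dict with the bf16 flag set.

    Hypothetical helper for illustration; it is not part of DeepSpeed.
    """
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    bf16 = updated.setdefault("bf16", {})
    bf16["enabled"] = True  # the flag lives under "bf16", so bf16 is switched on
    bf16["immediate_grad_update"] = enabled
    return updated


# Placeholder base config; the value here is an assumption, not a recommendation.
base_config = {"train_micro_batch_size_per_gpu": 1}

ds_config = with_immediate_grad_update(base_config)
config_text = json.dumps(ds_config, indent=2)
print(config_text)  # text that could be saved as config.json
```

Copying the dict keeps the base config reusable, which may be convenient when comparing runs with the flag on and off.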