DeepSpeed
0f0e38c5 - fixes #2498 (#2603)

Commit
2 years ago
fixes #2498 (#2603)

taking gradient accumulation steps into account for throughput calculation

Co-authored-by: Alexander Jipa <azzhipa@amazon.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
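
The commit message only names the fix, so here is a minimal sketch of the idea rather than the actual DeepSpeed change. It assumes throughput is reported per optimizer step and that one optimizer step consumes micro-batch size x number of workers x gradient accumulation steps samples, which matches how DeepSpeed defines its effective training batch size. The function name and parameters are hypothetical and do not come from the DeepSpeed code base.

```python
def samples_per_second(micro_batch_size: int,
                       num_workers: int,
                       grad_accum_steps: int,
                       step_seconds: float) -> float:
    """Estimate training throughput for one optimizer step.

    Hypothetical helper for illustration only. `step_seconds` is assumed
    to be the wall-clock time of a full optimizer step, i.e. it already
    spans all `grad_accum_steps` micro-batches.
    """
    if step_seconds <= 0:
        raise ValueError("step_seconds must be positive")
    # Each optimizer step processes grad_accum_steps micro-batches per worker,
    # so the sample count must be scaled by that factor.
    samples_per_step = micro_batch_size * num_workers * grad_accum_steps
    return samples_per_step / step_seconds


# Omitting the accumulation factor under-reports throughput by that factor.
with_accum = samples_per_second(4, 8, 16, 2.0)
without_accum = samples_per_second(4, 8, 1, 2.0)
print(with_accum, without_accum)
```

With 16 accumulation steps, ignoring the factor would report a throughput 16 times lower than the number of samples actually processed per second.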
Author
Alexander Jipa