Samyamr/grad acc stage2 (#338)

Commit

5 years ago

Samyamr/grad acc stage2 (#338)

* Adding gradient accumulation support for ZeRO Stage 2. Changing all Megatron-LM tests to also test gradient accumulation
* Gradient Accumulation support for Stage 2. Model tests added to test the feature
* formatting
* Update deepspeed_light.py removing comment
* Update ds_config_func_bs8_zero1.json reverting this file back. It's not needed for this PR
* defining baseline prefix

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

References

#338 - Samyamr/grad acc stage2

Author

samyam

Parents

458c0d92

DeepSpeed 7240abf3 - Samyamr/grad acc stage2 (#338)