DeepSpeed
2ffaecbb - Fix mis-aligned-grad

Commit
4 years ago
When a parameter's size is not divisible by the world size, the partitioned gradients are misaligned because the padding is handled incorrectly. This PR should fix that.
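To illustrate the kind of padding the message refers to, here is a minimal sketch in plain Python. It is not the code from this commit. The helper names `partition_flat` and `unpartition` are hypothetical, and the sketch assumes that padding is appended only at the tail of the flattened buffer so that every rank's slice starts at a multiple of the chunk size.

```python
def partition_flat(values, world_size):
    """Split a flat list into world_size equal chunks, zero-padding only the tail.

    Hypothetical helper for illustration; not DeepSpeed's implementation.
    """
    remainder = len(values) % world_size
    pad = (world_size - remainder) % world_size  # 0 when already divisible
    padded = list(values) + [0.0] * pad
    chunk = len(padded) // world_size
    chunks = [padded[r * chunk:(r + 1) * chunk] for r in range(world_size)]
    return chunks, pad


def unpartition(chunks, pad):
    """Concatenate the per-rank chunks and drop the tail padding."""
    flat = [v for c in chunks for v in c]
    return flat[:len(flat) - pad] if pad else flat


# 10 gradient values across 4 ranks: 2 padding elements, 3 values per rank.
grads = [float(i) for i in range(1, 11)]
chunks, pad = partition_flat(grads, world_size=4)
restored = unpartition(chunks, pad)
```

If the padding were inserted anywhere other than the tail, or its length were computed inconsistently between partitioning and gathering, the slice owned by each rank would shift relative to the parameter it belongs to, which is one way the misalignment described above could arise.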