DeepSpeed
75e579e7 - Average only valid part of the ipg buffer. (#5268)

Commit
1 year ago
Average only valid part of the ipg buffer. (#5268)

When contiguous gradients are used, the ipg buffer may not be fully utilized. Call average_tensor only for the slice with valid gradients.

Change-Id: I760559d52c2f91e15cd6cd0b48e534ec2352802a
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
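
The idea in the commit message can be illustrated with a minimal, pure-Python sketch. It is not DeepSpeed's implementation: `GradBucket`, `num_valid`, `valid_slice`, and `average_across_ranks` are hypothetical names, and plain lists stand in for tensors and for the cross-rank reduction. It only shows why averaging should cover the filled prefix of a preallocated buffer rather than its full capacity.

```python
# Hypothetical stand-in for a preallocated gradient bucket; not DeepSpeed code.
class GradBucket:
    def __init__(self, capacity):
        self.buffer = [0.0] * capacity  # fixed-size flat storage
        self.num_valid = 0              # number of elements actually filled

    def add(self, grads):
        """Copy gradients contiguously after the ones already stored."""
        end = self.num_valid + len(grads)
        if end > len(self.buffer):
            raise ValueError("bucket overflow")
        self.buffer[self.num_valid:end] = grads
        self.num_valid = end

    def valid_slice(self):
        """Return only the filled prefix, skipping unused trailing capacity."""
        return self.buffer[:self.num_valid]


def average_across_ranks(slices):
    """Element-wise mean of equally sized slices, one per simulated rank."""
    world_size = len(slices)
    return [sum(vals) / world_size for vals in zip(*slices)]


# Two simulated ranks, each filling 3 of 8 available slots.
rank0 = GradBucket(capacity=8)
rank1 = GradBucket(capacity=8)
rank0.add([1.0, 2.0, 3.0])
rank1.add([3.0, 4.0, 5.0])

# Only 3 elements per rank are processed, not the full capacity of 8.
averaged = average_across_ranks([rank0.valid_slice(), rank1.valid_slice()])
print(averaged)
```

In this sketch, averaging the whole buffer would process five unused elements per rank. Restricting the work to the valid slice avoids that extra computation and communication.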