DeepSpeed
d229ff17 - Zero3 Fix allreduce optimization for extra large tensor (#3832)

Commit

2 years ago

Zero3 Fix allreduce optimization for extra large tensor (#3832)

Grad tensors that don't fit in the bucket flat buffer are not added to it, but are still added to params_in_ipg_bucket. If such tensors exist, use reduce_scatter of params_in_ipg_bucket instead of allreduce, since allreduce assumes all grads are in ipg_bucket_flat_buffer.

Add test for reduce scatter=false

Fix padding to zeros instead of undefined values

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
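
The decision described in the commit message can be illustrated with a small, dependency-free sketch. This is not the DeepSpeed implementation: `Bucket`, `add_grad`, `choose_reduction`, and `pad_to_multiple` are hypothetical names, plain Python lists stand in for tensors, and no real collective is issued. It only shows why an oversized gradient forces the reduce_scatter path, and what zero padding looks like.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """Toy stand-in for the gradient bucket (hypothetical, not DeepSpeed code)."""
    capacity: int                                 # elements the flat buffer can hold
    flat: list = field(default_factory=list)      # plays the role of ipg_bucket_flat_buffer
    params: list = field(default_factory=list)    # plays the role of params_in_ipg_bucket
    has_oversized: bool = False                   # set when a grad was left out of `flat`

    def add_grad(self, name, grad):
        # A grad is copied into the flat buffer only if it fits in the remaining space.
        if len(grad) <= self.capacity - len(self.flat):
            self.flat.extend(grad)
        else:
            self.has_oversized = True
        # The parameter is tracked either way, so `params` can hold more than `flat`.
        self.params.append((name, grad))


def choose_reduction(bucket, reduce_scatter_enabled):
    # Allreduce over `flat` is only valid when every tracked grad is in `flat`.
    if reduce_scatter_enabled or bucket.has_oversized:
        return "reduce_scatter"
    return "allreduce"


def pad_to_multiple(grad, world_size):
    # Pad with explicit zeros so the padding never contains undefined values.
    remainder = len(grad) % world_size
    padding = 0 if remainder == 0 else world_size - remainder
    return list(grad) + [0.0] * padding


bucket = Bucket(capacity=4)
bucket.add_grad("w_small", [1.0, 2.0])
print(choose_reduction(bucket, reduce_scatter_enabled=False))

bucket.add_grad("w_huge", [0.5] * 10)  # does not fit in the flat buffer
print(choose_reduction(bucket, reduce_scatter_enabled=False))
```

With reduce scatter disabled, the first call selects allreduce because every tracked grad is in the flat buffer; after the oversized grad is added, the sketch falls back to reduce_scatter, matching the behavior the commit message describes.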

References

#3832 - Zero3 Fix allreduce optimization for extra large tensor

Author

BacharL

Parents

807d1b5d
