Improve padding util for compile (#7355)
This PR improves `pad_tensors` in `deepspeed/compile/util.py`, which
pads tensors so that all ranks have tensors with the same shape.
Previously, this function only adjusted tensor shapes, but tensor strides
could still differ across ranks, leading to recompilation on only some ranks.
Because DeepCompile inserts communication operators into the graph, such
rank-local recompilation can easily cause the communication collective to hang.
To address this issue, this PR replaces the use of
`torch.nn.functional.pad` with a new approach that ensures consistent
strides and avoids communication issues during distributed operations.
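As a rough illustration of the idea (not the PR's exact implementation), one way to guarantee identical strides on every rank is to allocate a fresh contiguous buffer of the target shape and copy the source tensor into it, rather than relying on `torch.nn.functional.pad`. The helper name `pad_tensor_contiguous` and the target-shape parameter below are hypothetical:

```python
import torch


def pad_tensor_contiguous(t: torch.Tensor, target_shape) -> torch.Tensor:
    # Hypothetical sketch: allocate a contiguous zero-filled buffer of the
    # target shape, so the output has canonical row-major strides on every
    # rank regardless of the input tensor's memory layout.
    out = t.new_zeros(target_shape)
    # Copy the original values into the leading region of each dimension.
    out[tuple(slice(0, s) for s in t.shape)] = t
    return out
```

Even when the input is non-contiguous (e.g. a transposed view), the padded result is contiguous by construction, so all ranks see the same shape *and* the same strides, avoiding guard failures that recompile only a subset of ranks.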
Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com>