DeepSpeed
600d280f - Improve padding util for compile (#7355)

Improve padding util for compile (#7355)

This PR improves `pad_tensors` in `deepspeed/compile/util.py`, which pads tensors so that all ranks hold tensors of the same shape. Previously, the function only adjusted tensor shapes; tensor strides could still differ across ranks, triggering recompilation on only some of them. Because DeepCompile inserts communication operators into the graph, such divergence easily deadlocks the communication collective.

To address this, the PR replaces the use of `torch.nn.functional.pad` with a new approach that ensures consistent strides across ranks and avoids communication hangs during distributed operations.

Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com>
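The idea behind the fix can be sketched as follows. This is a hypothetical illustration, not the actual code from the PR: the function name `pad_to_shape` and the specific strategy (allocating a fresh contiguous buffer of the target shape and copying the source into it) are assumptions. The point is that a freshly allocated buffer has canonical contiguous strides determined solely by the target shape and dtype, so every rank ends up with an identical layout regardless of how its input tensor was laid out.

```python
import torch


def pad_to_shape(t: torch.Tensor, target_shape: tuple) -> torch.Tensor:
    """Hypothetical sketch: pad `t` up to `target_shape` by copying it
    into a freshly allocated, contiguous zero buffer.

    Unlike padding an existing (possibly non-contiguous) view, the
    result's strides depend only on `target_shape`, so all ranks that
    agree on the target shape get byte-identical layouts.
    """
    out = torch.zeros(target_shape, dtype=t.dtype, device=t.device)
    # Copy the source tensor into the leading corner of the buffer.
    out[tuple(slice(0, s) for s in t.shape)] = t
    return out


# A non-contiguous input (a transposed view) still yields a
# contiguous, deterministically strided result.
x = torch.arange(6).reshape(2, 3).t()  # shape (3, 2), non-contiguous
padded = pad_to_shape(x, (4, 4))
```

Used this way, two ranks padding differently shaped (or differently strided) local tensors to a common target shape produce tensors with identical strides, so `torch.compile` sees the same guards on every rank and no rank recompiles alone.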