fix(zero): detach flat buffer to prevent autograd inplace error on CPU accelerator (#7948)
The on-device flatten path (introduced in #7828) passes nn.Parameter
objects with requires_grad=True to torch.cat(), creating a flat buffer
with a CatBackward0 grad_fn. Later, _unflatten_dense_tensors produces
SplitBackward0 views that are assigned to the model params. An in-place
copy_() on these views during the optimizer step raises:
RuntimeError: Output 0 of SplitBackward0 is a view and is being modified
inplace.
This especially affects CPU training, where
CPU_Accelerator.is_available() returns True and available_memory()
returns system RAM, so the on-device path is always taken.
Fix: add .detach() to the flattened buffer, matching the implicit detach
behavior of the CPU-offload path (param.data.cpu() + .to(device)).
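A minimal standalone sketch of the failure mode and the fix, using
torch.cat()/torch.split() directly in place of DeepSpeed's flatten/unflatten
helpers (an assumption for illustration; the real code path goes through
_unflatten_dense_tensors):

```python
import torch

params = [torch.nn.Parameter(torch.randn(4)) for _ in range(2)]

# Buggy path: cat() of requires_grad=True params records CatBackward0,
# so the split outputs are autograd views (SplitBackward0).
flat = torch.cat([p.view(-1) for p in params])
views = torch.split(flat, [p.numel() for p in params])
try:
    views[0].copy_(torch.zeros(4))  # in-place write on a multi-output view
except RuntimeError as e:
    print(e)  # "Output 0 of SplitBackward0 is a view and is being modified inplace..."

# Fixed path: detaching the flat buffer drops the grad_fn, so the split
# outputs are plain views and in-place copy_() succeeds.
flat = torch.cat([p.view(-1) for p in params]).detach()
views = torch.split(flat, [p.numel() for p in params])
views[0].copy_(torch.zeros(4))  # OK: no autograd graph to protect
```

This mirrors why the CPU-offload path never hit the error: param.data.cpu()
already returns a tensor outside the autograd graph.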
Also rename flatten_on_gpu -> flatten_on_accelerator and replace
GPU-specific terminology in comments/logs with accelerator-generic
equivalents.
---------
Signed-off-by: Guokai Ma <guokai.ma@intel.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>