Fix: `UnboundLocalError` for variable `dim` (#7449)
## Fix `UnboundLocalError` in `ZeroLinear.backward()` when training only
bias parameters, as mentioned in #7435
This PR addresses an issue in the `ZeroLinear.backward()` method, where
the local variable `dim` could be referenced before assignment. This
happens specifically when:
- Only the bias parameters are set to `requires_grad=True`, and
- The training setup uses **ZeRO Stage 3**, **AMP**, and **gradient
checkpointing**.
### Problem
When only the bias requires gradients, the branch that assigns `dim =
grad_output.dim()` is skipped, but `dim` is still read later in the
computation, leading to an `UnboundLocalError: local variable 'dim'
referenced before assignment`.
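A minimal sketch of the failure mode (the names and the stand-in tensor are illustrative, not the actual DeepSpeed implementation): `dim` is bound only inside the weight branch, so a bias-only backward reads an unbound local.

```python
class FakeGrad:
    """Stand-in for a gradient tensor, exposing only the two
    methods this sketch needs (illustrative, not PyTorch)."""
    def dim(self):
        return 2
    def sum(self, dims):
        return "grad_bias"

def buggy_backward(grad_output, weight_requires_grad, bias_requires_grad):
    if weight_requires_grad:
        dim = grad_output.dim()  # only assigned when the weight needs a grad
        # ... weight-gradient computation using `dim` ...
    if bias_requires_grad:
        # reads `dim` even when the branch above was skipped
        return grad_output.sum(list(range(dim - 1)))
```

With `weight_requires_grad=False` and `bias_requires_grad=True`, the second branch raises `UnboundLocalError` because the first branch never ran.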
### Fix
Move the assignment `dim = grad_output.dim()` outside the conditional so
that `dim` is always defined before it is used in any branch of the
gradient computation.
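The fix, in the same illustrative sketch (names and the stand-in tensor are hypothetical, not the actual DeepSpeed code): the assignment is hoisted above the branches, so every path sees a bound `dim`.

```python
class FakeGrad:
    """Stand-in for a gradient tensor (illustrative, not PyTorch)."""
    def dim(self):
        return 2
    def sum(self, dims):
        return "grad_bias"

def fixed_backward(grad_output, weight_requires_grad, bias_requires_grad):
    # Hoisted out of the conditional: always bound before any branch runs.
    dim = grad_output.dim()
    if weight_requires_grad:
        pass  # ... weight-gradient computation using `dim` ...
    if bias_requires_grad:
        return grad_output.sum(list(range(dim - 1)))
```

Bias-only training (`weight_requires_grad=False`) now completes without an `UnboundLocalError`.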
### Impact
The backward pass now works regardless of which subset of parameters
requires gradients, including bias-only fine-tuning under ZeRO Stage 3
with AMP and gradient checkpointing.
Signed-off-by: weeknan <zhounan0431@163.com>
Co-authored-by: Olatunji Ruwase <tjruwase@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>