Fix: `UnboundLocalError` for variable `dim` (#7449)
## Fix `UnboundLocalError` in `ZeroLinear.backward()` when training only
bias parameters, as mentioned in #7435
This PR addresses an issue in the `ZeroLinear.backward()` method, where
the local variable `dim` could be referenced before assignment. This
happens specifically when:
- Only the bias parameters are set to `requires_grad=True`, and
- The training setup uses **ZeRO Stage 3**, **AMP**, and **gradient
checkpointing**.
### Problem
When only the bias requires gradients, the branch that assigns `dim =
grad_output.dim()` is skipped, but `dim` is still read later in the
computation, leading to an `UnboundLocalError: local variable 'dim'
referenced before assignment`.
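A minimal sketch of the failure mode (the names and the stand-in tensor are illustrative, not the actual DeepSpeed implementation): `dim` is bound only inside the weight branch, so a bias-only backward reads an unbound local.

```python
class FakeGrad:
    """Stand-in for a gradient tensor, exposing only the two
    methods this sketch needs (illustrative, not PyTorch)."""
    def dim(self):
        return 2
    def sum(self, dims):
        return "grad_bias"

def buggy_backward(grad_output, weight_requires_grad, bias_requires_grad):
    if weight_requires_grad:
        dim = grad_output.dim()  # only assigned when the weight needs a grad
        # ... weight-gradient computation using `dim` ...
    if bias_requires_grad:
        # reads `dim` even when the branch above was skipped
        return grad_output.sum(list(range(dim - 1)))
```

With `weight_requires_grad=False` and `bias_requires_grad=True`, the second branch raises `UnboundLocalError` because the first branch never ran.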
### Fix
Move the assignment `dim = grad_output.dim()` outside the conditional so
that `dim` is always defined before it is used in any branch of the
gradient computation.
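The fix, in the same illustrative sketch (names and the stand-in tensor are hypothetical, not the actual DeepSpeed code): the assignment is hoisted above the branches, so every path sees a bound `dim`.

```python
class FakeGrad:
    """Stand-in for a gradient tensor (illustrative, not PyTorch)."""
    def dim(self):
        return 2
    def sum(self, dims):
        return "grad_bias"

def fixed_backward(grad_output, weight_requires_grad, bias_requires_grad):
    # Hoisted out of the conditional: always bound before any branch runs.
    dim = grad_output.dim()
    if weight_requires_grad:
        pass  # ... weight-gradient computation using `dim` ...
    if bias_requires_grad:
        return grad_output.sum(list(range(dim - 1)))
```

Bias-only training (`weight_requires_grad=False`) now completes without an `UnboundLocalError`.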
### Impact
The backward pass now works regardless of which subset of parameters
requires gradients, including bias-only fine-tuning under ZeRO Stage 3
with AMP and gradient checkpointing.
Signed-off-by: weeknan <zhounan0431@163.com>
Co-authored-by: Olatunji Ruwase <tjruwase@gmail.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>