DeepSpeed
d568375e - [Engine] Only scale gradients if scale_wrt_gas is True (#7724)

Commit
15 days ago
[Engine] Only scale gradients if scale_wrt_gas is True (#7724)

`_backward_prologue_per_tensor` checks if `scale_wrt_gas` is True and only scales if so.

For https://github.com/huggingface/accelerate/issues/3877

---------

Signed-off-by: Kashif Rasul <kashif.rasul@gmail.com>
Signed-off-by: Masahiro Tanaka <mtanaka@anyscale.com>
Co-authored-by: Masahiro Tanaka <mtanaka@anyscale.com>
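The behavior described in the commit message can be illustrated with a minimal sketch. This is not the DeepSpeed implementation: the function name `scale_grad_for_accumulation` and its parameters are hypothetical, plain floats stand in for gradient tensors, and it assumes that "gas" refers to the number of gradient accumulation steps and that scaling means dividing by that count.

```python
def scale_grad_for_accumulation(grad: float, gas: int, scale_wrt_gas: bool = True) -> float:
    """Hypothetical helper: divide a gradient by the accumulation step count.

    grad          -- stand-in for a gradient value (a tensor in real code)
    gas           -- assumed number of gradient accumulation steps
    scale_wrt_gas -- when False, the gradient is returned unchanged
    """
    if scale_wrt_gas and gas > 1:
        # Scale so that accumulated gradients average over the micro-batches.
        return grad / gas
    # Caller opted out of scaling (e.g. it already scaled the loss itself).
    return grad


scaled = scale_grad_for_accumulation(8.0, gas=4, scale_wrt_gas=True)
unscaled = scale_grad_for_accumulation(8.0, gas=4, scale_wrt_gas=False)
```

The point of the guard is that a caller who has already divided the loss by the accumulation count can pass `scale_wrt_gas=False` and avoid scaling twice.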