DeepSpeed
54c06872 - stage3: efficient compute of scaled_global_grad_norm (#5256)

Commit
1 year ago
stage3: efficient compute of scaled_global_grad_norm (#5256)

using torch.norm instead of inefficient for loop

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
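
The commit message describes replacing a Python for loop with a single torch.norm call when combining per-group gradient norms. The actual diff is not shown here, so the following is only a minimal sketch of that idea, not the code from stage3.py. The function names `global_norm_loop` and `global_norm_vectorized` and the sample tensors are hypothetical, introduced for illustration.

```python
import torch


def global_norm_loop(norm_groups):
    """Hypothetical 'before' version: accumulate squared norms one by one.

    Each .item() call copies a scalar to the host, which forces a
    device synchronization when the tensors live on a GPU.
    """
    total = 0.0
    for norm in norm_groups:
        total += norm.item() ** 2
    return total ** 0.5


def global_norm_vectorized(norm_groups):
    """Hypothetical 'after' version: stack the scalars and take one 2-norm.

    The 2-norm of the vector of per-group norms equals the 2-norm of
    all gradients concatenated together.
    """
    return torch.norm(torch.stack(norm_groups))


torch.manual_seed(0)
# Stand-ins for per-group flattened gradients (assumed shapes).
grads = [torch.randn(n) for n in (8, 16, 32)]
norm_groups = [torch.norm(g) for g in grads]

loop_result = global_norm_loop(norm_groups)
vectorized_result = global_norm_vectorized(norm_groups)
full_norm = torch.norm(torch.cat(grads))
```

Both functions should agree with `full_norm` up to floating-point error. The vectorized form keeps the result as a tensor, so it avoids a host round trip per parameter group.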
  • deepspeed/runtime/zero/stage3.py