Fix a convergence issue in TP topology caused by incorrect grad_norm. #5411
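For context, here is a minimal sketch of how a global gradient norm can be computed under tensor parallelism (TP) without double-counting. It is not the code from this PR. It assumes a common convention: gradients of replicated (non-TP) parameters are identical on every TP rank and are counted only on TP rank 0, while gradients of sharded (TP) parameters are counted on every rank. The names `local_sq_norm`, `global_grad_norm`, and the `replicated` flag are hypothetical, and a plain Python sum stands in for the all-reduce across the TP group.

```python
import math

def local_sq_norm(grads, tp_rank):
    """Sum of squared gradient entries that this TP rank contributes."""
    total = 0.0
    for g in grads:
        # Assumption: replicated gradients are identical on all TP ranks,
        # so only rank 0 counts them; other ranks skip them.
        if g["replicated"] and tp_rank != 0:
            continue
        total += sum(v * v for v in g["values"])
    return total

def global_grad_norm(per_rank_grads):
    """L2 norm over all ranks; the sum stands in for all-reduce(SUM)."""
    total = sum(local_sq_norm(g, r) for r, g in enumerate(per_rank_grads))
    return math.sqrt(total)

# Two TP ranks: one replicated gradient (same on both) and one sharded gradient.
ranks = [
    [{"replicated": True, "values": [3.0]}, {"replicated": False, "values": [4.0]}],
    [{"replicated": True, "values": [3.0]}, {"replicated": False, "values": [12.0]}],
]
norm = global_grad_norm(ranks)

# Naive version that counts the replicated gradient on every rank.
naive = math.sqrt(sum(v * v for r in ranks for g in r for v in g["values"]))
```

In this toy setup the naive norm is larger than the deduplicated one, so gradient clipping based on it would scale gradients differently depending on the TP degree, which is one way an incorrect grad_norm can affect convergence.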
fix grad norm for tp
287fa5eb
refine code
a7e8a7fe
remove unnecessary clip_gradients fun
ea41928e
improve perf by loop-free implementations
e74b7ca6
Modify the comments.
79cc4cef
update
3ebed5ea
Merge remote-tracking branch 'master' into tp_grad_fix
fc537b8e
refine comments
df976ca6
tohtana
approved these changes
on 2024-04-16
Merge branch 'master' into tp_grad_fix
a40263f5