fix overflow when training mDeberta in fp16 #24116
Porting changes from https://github.com/microsoft/DeBERTa/ that hopef…
04651857
Updates to deberta modeling from microsoft repo
b367defb
Performing some cleanup
9c056f23
Undoing changes that weren't necessary
ba8f2ade
Undoing float calls
3492dd70
Minimally change the p2c block
b75fbd8a
Fix error
b5b697ab
Minimally changing the c2p block
6d69c7fb
Switch to torch sqrt
0ea34591
Remove math
3c95c8a1
Adding back the to calls to scale
dd8bd345
Undoing attention_scores change
b930014d
Removing commented out code
9d22fde5
Updating modeling_sew_d.py to satisfy utils/check_copies.py
f9d52efd
Missed changed
c90cde89
Further reduce changes needed to get fp16 working
9969b99d
Reverting changes to modeling_sew_d.py
1ec5df7b
Make same change in TF
0c9ca818
sjrl deleted the mdeberta-fp16-overflow branch 2 years ago
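
For readers unfamiliar with the failure mode named in the title, here is a minimal sketch of one common way attention scores overflow in fp16, and how reordering the scaling avoids it. It is not taken from this PR's diff. The helper names (`to_fp16`, `score_scale_after`, `score_scale_before`), the head size of 64, and the scale factor of 3 are illustrative assumptions. Half precision is emulated with the standard library's `struct` format `e` so the example runs without PyTorch.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value.

    struct raises OverflowError for finite values beyond the fp16 range
    (about 65504); we map those to +/-inf, which is what fp16 arithmetic
    on an accelerator would produce.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def score_scale_after(q, k, scale):
    """Dot product first, then divide: the raw product is stored in fp16."""
    raw = to_fp16(sum(a * b for a, b in zip(q, k)))
    return to_fp16(raw / scale)


def score_scale_before(q, k, scale):
    """Divide the key by the scale first, so the product stays in range."""
    k_scaled = [to_fp16(b / scale) for b in k]
    return to_fp16(sum(a * b for a, b in zip(q, k_scaled)))


# Hypothetical values: one query/key pair with head size 64 and large activations.
head_dim = 64
scale_factor = 3  # assumption for illustration only
scale = math.sqrt(head_dim * scale_factor)
q = [40.0] * head_dim
k = [40.0] * head_dim

after = score_scale_after(q, k, scale)    # raw product exceeds the fp16 range
before = score_scale_before(q, k, scale)  # same quantity, computed in range

print("scale after matmul: ", after)
print("scale before matmul:", before)
```

Both functions compute the same mathematical quantity. The only difference is whether the unscaled dot product ever has to be represented in half precision, which is why a change of this kind can be small in the diff yet decide whether training produces infinities.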