Test different layer norm #270

thomasw21 wants to merge 27 commits into main from thomas/test_different_layer_norm
thomasw21 WIP
8d7a6038
thomasw21 Wip
240f673e
thomasw21 Woops
1cdcd7de
thomasw21 WIP
29372806
thomasw21 Woops
7fcff06b
thomasw21 Woops
1f2f8007
thomasw21 Woops
f152e487
thomasw21 Test with alibi
ce02dd16
thomasw21 Still trying to reproduce
02365d14
thomasw21 Huh
42d6b4e3
thomasw21 Have high LR to see weights actually change
c20c8ba4
thomasw21 Launch bf16
7f2441ed
thomasw21 Woops
a4172bf9
thomasw21 Make test to work with both bf16 and fp16 to see who fails
5fbe1072
thomasw21 Woops
a0c09132
thomasw21 Remove assert
6b19339c
thomasw21 Try to figure out how the divergence happens
a5e32958
thomasw21 I think bias starts to diverge first
7145f6df
thomasw21 Woops
311e5317
thomasw21 Woops
39d4b8f9
thomasw21 Woops
8ffb278f
thomasw21 Add embed layer norm
2389bfdf
thomasw21 Woops
0cf35ee3
thomasw21 Backward compatibility on torch
f0d6d179
thomasw21 Better
07ccb3db
stas00 Merge remote-tracking branch 'origin/main' into thomas/test_different…
3c5e4914
stas00 fix
a5b5edc0
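Taken together, the commit messages trace a debugging loop: run two supposedly-equivalent layer norm implementations side by side under both bf16 and fp16 ("Make test to work with both bf16 and fp16 to see who fails"), raise the learning rate so the weights actually change, and watch which parameter drifts first ("I think bias starts to diverge first"). Below is a minimal, hypothetical sketch of what such a comparison could look like in plain PyTorch. It is not the PR's actual test code: the stand-in modules, dummy loss, optimizer, learning rate, and step count are all illustrative assumptions.

```python
# Hypothetical sketch only -- not this PR's test. It mirrors the debugging
# loop suggested by the commit log: train two LayerNorm implementations in
# lockstep under a reduced-precision dtype, with a high learning rate so
# the weights actually move, and report where weight and bias drift apart.
import torch

def compare_layer_norms(hidden_size=64, seq_len=8, dtype=torch.bfloat16,
                        steps=10, lr=1.0):
    torch.manual_seed(0)
    # fp16 LayerNorm may not be supported on CPU; prefer a GPU if present.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Reference implementation.
    ref = torch.nn.LayerNorm(hidden_size).to(device=device, dtype=dtype)
    # Implementation under test -- a second torch.nn.LayerNorm here as a
    # stand-in; in practice this would be e.g. a fused kernel. Start both
    # from identical parameters so any later difference is real divergence.
    test = torch.nn.LayerNorm(hidden_size).to(device=device, dtype=dtype)
    test.load_state_dict(ref.state_dict())

    opt_ref = torch.optim.SGD(ref.parameters(), lr=lr)
    opt_test = torch.optim.SGD(test.parameters(), lr=lr)

    for step in range(steps):
        # Feed both modules the exact same batch each step.
        x = torch.randn(seq_len, hidden_size, device=device, dtype=dtype)
        for module, opt in ((ref, opt_ref), (test, opt_test)):
            loss = module(x).float().pow(2).mean()  # dummy loss, fp32 reduce
            opt.zero_grad()
            loss.backward()
            opt.step()

        # Track weight and bias separately: per the commit log, the bias
        # appears to be the first parameter to diverge.
        w_diff = (ref.weight - test.weight).abs().max().item()
        b_diff = (ref.bias - test.bias).abs().max().item()
        print(f"{dtype} step {step}: max|dw|={w_diff:.3e} max|db|={b_diff:.3e}")

# Run under both dtypes to see which one fails first.
compare_layer_norms(dtype=torch.bfloat16)
compare_layer_norms(dtype=torch.float16)
```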
