Much more efficient and clear weight initialization and tie weights #42191
Cyrilvallez changed the title from "Much more efficient and clear weight initialization" to "Much more efficient and clear weight initialization and tie weights" 69 days ago
everything until informer
da26896e
everything until perceiver
d561c6f9
all of them finally
ceea3058
style
187bb8ef
replace by transformers init everywhere
2cd2addc
use relative import instead
6bdffed5
deprecated models
d25fe728
style
82899acb
start contexts
a4ab5985
small fixes
192151e0
fix modular
5efa9a8b
remove class switch
c882d608
do not initialize tied weights
22a55a36
typo
694440bb
fix
5a0174ec
improve
5423e064
improve comments
9b7ace53
improve
4acef54a
improve
c58d243c
fix zamba
2edc8c17
fix import
2f40139d
add the post_init
2dd4e00a
more post_init
3ede2872
Cyrilvallez force pushed from f33c91ec to 3ede2872 69 days ago
fix
86f7169d
protect
706799e9
more post_init
1da2d273
fix
83e0ada2
fixes
50187a90
fix
16173f06
fix
bae372ae
switch flag name
8500bcf9
more fixes
cdada869
fixes
99961fc6
fixes
557ef759
Cyrilvallez force pushed from 79e84f90 to 557ef759 69 days ago
Merge branch 'main' into better-init-2
2dd08170
copies
912440bc
fix
acdaf9e9
finally find the culprit
cc10ea4e
style
627e77b3
last small
db42923c
big bird
17115a22
better
bbdc5a5b
update init check
3a12aec8
final touch
9beb88c0
do it everywhere
60928045
Cyrilvallez
deleted the better-init-2 branch 68 days ago
Assignees
No one assigned