🔴 🚨 Resizing tokens embeddings: initialize from old embeddings' normal distribution. #33325
abuelnasr0 changed the title from "Resizing tokens embeddings: initialize from old embeddings' normal distribution." to "🔴 🔴 Resizing tokens embeddings: initialize from old embeddings' normal distribution." 1 year ago
abuelnasr0 changed the title from "🔴 🔴 Resizing tokens embeddings: initialize from old embeddings' normal distribution." to "🔴 🚨 Resizing tokens embeddings: initialize from old embeddings' normal distribution." 1 year ago
initialize new embeddings from normal distrib
25c92e1e
Fix typo in comments
a95639c3
Fix typo in comments
d850b995
Fix style
3f445078
Fix variables naming
5ea5f828
Add tests
d1d81d52
Fix style
f3aaf0af
code consistency nit
bdef61af
Add deepspeed support
15a7b5ab
Add deepspeed support
6e40b4f6
Convert embeddings weights to float32 before computations
aba7d8c2
Add deepspeed tests
4f1b0fa5
Cover when vocab_size is smaller than embedding_size
dea8e285
Style fix
84f8cfa7
Add tests for vocab_size smaller than hidden_size
2923e858
Style fix
188ba1bd
Nits in tests
22ac85c3
Nits in tests
3e42f66e
Check for deepspeed before importing it
226f31c7
Increase vocab_size for positive definite covariance matrix test
cef744fa
Add warning
6583cd5e
Add multivariate_resizing flag and implement resizing for lm_heads
7577cd49
Fix typo
0472bacc
Fix wrong bias indexing
fd4ad000
Fix bias is zero check
6ff2bca9
remove multivariate_resizing flag from tests
12e61c61
Initialize bias from old bias normal distribution
eb80c339
Fixup
ef6bdbc4
Code usability
5cdce5f3
Use mean_resizing instead of multivariate_resizing
f4a9cf46
Fix up
fc436d7e
Fix comments and docs
8e60a368
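The idea named in the title and in the commits above (sample new rows from a normal distribution fitted to the old embeddings, with a fallback when the covariance matrix is not positive definite) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code from this PR: the function names `mean_and_covariance`, `cholesky`, and `resize_embeddings` are hypothetical, the `0.02` standard deviation is a placeholder, and the real change works on model weight tensors and also covers DeepSpeed, float32 conversion, and lm_head biases, none of which are shown here. Only the `mean_resizing` flag name comes from the commit list.

```python
import math
import random


def mean_and_covariance(rows):
    """Return the per-dimension mean and the covariance matrix (divided by n) of rows."""
    n, dim = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - mean[j] for j in range(dim)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
        for i in range(dim)
    ]
    return mean, cov


def cholesky(matrix):
    """Return lower-triangular L with L L^T == matrix, or None if a pivot is not positive."""
    dim = len(matrix)
    lower = [[0.0] * dim for _ in range(dim)]
    for i in range(dim):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    return None  # not positive definite
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def resize_embeddings(old_rows, new_num_tokens, mean_resizing=True, rng=None):
    """Hypothetical helper: truncate or extend an embedding table given as a list of rows."""
    if rng is None:
        rng = random.Random(0)
    dim = len(old_rows[0])
    resized = [list(r) for r in old_rows[:new_num_tokens]]
    num_added = new_num_tokens - len(resized)
    if num_added <= 0:
        return resized  # shrinking: keep the leading rows only

    if not mean_resizing:
        # Assumed placeholder for a default init: independent small-variance normals.
        for _ in range(num_added):
            resized.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
        return resized

    mean, cov = mean_and_covariance(old_rows)
    lower = cholesky(cov)
    for _ in range(num_added):
        if lower is None:
            # Covariance is not positive definite (for example, fewer tokens
            # than dimensions): fall back to the mean of the old embeddings.
            resized.append(list(mean))
        else:
            # Sample mean + L z with z ~ N(0, I), which has covariance L L^T.
            z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            resized.append(
                [mean[i] + sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(dim)]
            )
    return resized
```

Calling `resize_embeddings(table, len(table) + 2)` on a table with more rows than dimensions appends two sampled rows and leaves the existing rows unchanged. When the table has fewer rows than dimensions, the covariance is singular and the sketch falls back to copying the mean, which is the case the "vocab_size smaller than hidden_size" commits appear to target.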