transformers
Fix bug in gpt2's (from-scratch) special scaled weight initialization
#17877
Merged

karpathy
only special scale init each gpt2 c_proj weight once, on exact match
0637d69e
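As a rough illustration of why "once, on exact match" matters, here is a minimal toy sketch. It assumes (this is a reading of the commit message, not something shown on this page) that the init hook runs once per module and loops over that module's `named_parameters()`, so a substring test on the name also fires from every ancestor module. The `Node` class, `walk`, `count_inits`, and the two-block layout are hypothetical stand-ins, not transformers code.

```python
from collections import Counter


class Node:
    """Hypothetical stand-in for a module: owns parameter names and child modules."""

    def __init__(self, params=(), children=None):
        self.params = list(params)
        self.children = dict(children or {})

    def named_parameters(self, prefix=""):
        # Yield names relative to this module, recursing into children.
        for p in self.params:
            yield prefix + p
        for cname, child in self.children.items():
            yield from child.named_parameters(prefix + cname + ".")


def walk(node, path=""):
    # Visit every module in the tree (children first, then the module itself).
    for cname, child in node.children.items():
        yield from walk(child, path + cname + ".")
    yield path, node


def count_inits(model, matches):
    # Count how often each c_proj weight would be re-initialized
    # if the hook ran on every module and used `matches` on relative names.
    counts = Counter()
    for path, module in walk(model):
        for name in module.named_parameters():
            if matches(name):
                counts[path + name] += 1
    return counts


def linear():
    return Node(["weight", "bias"])


def block():
    return Node(children={
        "attn": Node(children={"c_attn": linear(), "c_proj": linear()}),
        "mlp": Node(children={"c_fc": linear(), "c_proj": linear()}),
    })


model = Node(children={"h": Node(children={"0": block(), "1": block()})})

# Substring test: also matches from every ancestor of the owning module.
substring_counts = count_inits(model, lambda n: "c_proj" in n and "weight" in n)
# Exact test: matches only from the direct parent of c_proj.
exact_counts = count_inits(model, lambda n: n == "c_proj.weight")

print(dict(substring_counts))
print(dict(exact_counts))
```

In this toy tree each of the four `c_proj` weights is matched four times under the substring test and exactly once under the exact-match test; the repeat count in a real model would depend on its actual nesting depth.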
HuggingFaceDocBuilderDev
fix double quotes
0d9b891d
lvwerra commented on 2022-06-25
sgugger approved these changes on 2022-06-25
sgugger merged e02037b3 into main 3 years ago
siddk