transformers
e02037b3 - Fix bug in gpt2's (from-scratch) special scaled weight initialization (#17877)

Commit
3 years ago
Fix bug in gpt2's (from-scratch) special scaled weight initialization (#17877)

* only special scale init each gpt2 c_proj weight once, on exact match
* fix double quotes

Co-authored-by: leandro <leandro.vonwerra@spoud.io>
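The difference between an exact name match and a looser match can be sketched with a toy module tree. This is not the transformers code: `Param`, `Node`, `build_block`, and `count_hits` are hypothetical stand-ins. The sketch assumes that the initializer is applied to every submodule in turn and loops over that submodule's named parameters, and that the earlier check was a substring test. Under those assumptions, a substring test matches the same `c_proj` weight at every ancestor, while an exact match on `c_proj.weight` fires only at the direct parent.

```python
class Param:
    """Hypothetical parameter that only counts how often it was re-initialized."""
    def __init__(self):
        self.hits = 0


class Node:
    """Hypothetical stand-in for a module with own parameters and child modules."""
    def __init__(self, params=(), **children):
        self.params = {name: Param() for name in params}
        self.children = children

    def named_parameters(self, prefix=""):
        # Names are relative to the node they are requested from.
        for name, p in self.params.items():
            yield prefix + name, p
        for child_name, child in self.children.items():
            yield from child.named_parameters(prefix + child_name + ".")

    def apply(self, fn):
        # Visit children first, then the node itself.
        for child in self.children.values():
            child.apply(fn)
        fn(self)


def build_block():
    # A toy transformer block: attention and MLP, each with a c_proj layer.
    attn = Node(c_attn=Node(["weight"]), c_proj=Node(["weight"]))
    mlp = Node(c_fc=Node(["weight"]), c_proj=Node(["weight"]))
    return Node(attn=attn, mlp=mlp)


def count_hits(matches):
    """Apply a toy initializer everywhere; return hit counts keyed by full name."""
    root = Node(h0=build_block())

    def init_fn(module):
        for name, p in module.named_parameters():
            if matches(name):
                p.hits += 1  # stands in for the special scaled re-initialization

    root.apply(init_fn)
    return {name: p.hits for name, p in root.named_parameters()}


# Assumed looser check: substring test on the relative parameter name.
substring = count_hits(lambda n: "c_proj" in n and "weight" in n)
# Exact match, as described in the commit message.
exact = count_hits(lambda n: n == "c_proj.weight")

print(substring["h0.attn.c_proj.weight"])  # 3
print(exact["h0.attn.c_proj.weight"])      # 1
```

In this toy tree the substring test hits each `c_proj` weight three times (at `attn` or `mlp`, at `h0`, and at the root), and the count would grow with nesting depth. The exact match hits each one once regardless of depth.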