transformers
don't initialize the output embeddings if we're going to tie them to input embeddings
#28192
Merged
