transformers 205bc415 - Fix GPT-NeoX-20B past handling, attention computation (#17811)
Commit
3 years ago
Fix GPT-NeoX-20B past handling, attention computation (#17811)

* Fix GPT-NeoX-20B past handling, swap attention computation to hopefully avoid NaN, update docs
* 20B tests
References
#17811 - Fix GPT-NeoX-20B past handling, attention computation
Author
zphang
Parents
692e61e9