flax
1b2754a9
- Split the attention softmax so that the expensive elementwise division happens on an array that's O(N) rather than O(N^2)
Commit
5 years ago
Split the attention softmax so that the expensive elementwise division happens on an array that's O(N) rather than O(N^2)

PiperOrigin-RevId: 317793206
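The idea in the commit message can be illustrated with a minimal JAX sketch. This is not the code from the commit itself: the function names `attention_naive` and `attention_split`, the single-head 2-D shapes, and the scaling by the square root of the head dimension are all assumptions made for illustration. The sketch leaves the exponentiated logits unnormalized, multiplies them by the values, and only then divides by the row sums, so the division touches an `[N, d]` array instead of the `[N, N]` weight matrix.

```python
import jax
import jax.numpy as jnp


def attention_naive(q, k, v):
    """Reference: normalize the full [N, N] weight matrix, then apply it to v."""
    logits = q @ k.T / jnp.sqrt(q.shape[-1])
    weights = jax.nn.softmax(logits, axis=-1)  # division over N * N elements
    return weights @ v


def attention_split(q, k, v):
    """Hypothetical split form: divide the [N, d] output instead of the weights."""
    logits = q @ k.T / jnp.sqrt(q.shape[-1])
    # Subtract the row max for numerical stability before exponentiating.
    unnormalized = jnp.exp(logits - logits.max(axis=-1, keepdims=True))
    denom = unnormalized.sum(axis=-1, keepdims=True)  # shape [N, 1]
    return (unnormalized @ v) / denom  # division over N * d elements


# Small random inputs: sequence length N = 8, head dimension d = 4 (arbitrary).
key_q, key_k, key_v = jax.random.split(jax.random.PRNGKey(0), 3)
q = jax.random.normal(key_q, (8, 4))
k = jax.random.normal(key_k, (8, 4))
v = jax.random.normal(key_v, (8, 4))

out_naive = attention_naive(q, k, v)
out_split = attention_split(q, k, v)
```

Both functions compute the same quantity up to floating-point rounding, because dividing each row of the weights by its sum before the matrix product is equivalent to dividing the corresponding row of the product afterwards. For a fixed head dimension, the divided array grows linearly with sequence length rather than quadratically.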
References
test_317793206
Author
jekbradbury
Committer
a-googler
Parents
ed42d067