flax
8a090e3e - Allow Adafactor to not update certain parameters proportional

Commit
4 years ago
Allow Adafactor to not update certain parameters proportional to their scale.

This is a potential cause of bad quality when the relative positional embeddings are updated proportional to their scale in large LMs.

PiperOrigin-RevId: 368110589
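To illustrate the idea in the commit message, here is a minimal sketch in plain Python. It is not the Flax implementation. It assumes the usual Adafactor convention that the step size for a parameter is multiplied by that parameter's root-mean-square value (floored by a small epsilon), and shows how a per-parameter predicate could opt selected parameters out of that scaling. The names `apply_update`, `skip_scale_rule`, and the parameter paths are hypothetical and chosen only for this example.

```python
import math


def rms(values):
    """Root-mean-square of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def apply_update(params, updates, learning_rate, skip_scale_rule, epsilon=1e-3):
    """Apply one update step to a dict of flat parameter lists.

    Hypothetical helper: parameters for which skip_scale_rule(name) is True
    use the plain learning rate; all others have the step multiplied by
    max(rms(param), epsilon).
    """
    new_params = {}
    for name, values in params.items():
        step = learning_rate
        if not skip_scale_rule(name):
            # Scale the step by the parameter's own magnitude.
            step *= max(rms(values), epsilon)
        new_params[name] = [v - step * u for v, u in zip(values, updates[name])]
    return new_params


# Two parameters with identical values and identical raw updates.
params = {
    "dense/kernel": [3.0, 4.0],
    "relpos_bias/embedding": [3.0, 4.0],  # assumed name for relative position embeddings
}
updates = {name: [1.0, 1.0] for name in params}

new_params = apply_update(
    params,
    updates,
    learning_rate=0.1,
    skip_scale_rule=lambda name: "relpos_bias" in name,
)
```

In this sketch the dense kernel moves by roughly 0.35 per element (0.1 times its RMS of about 3.54), while the relative position embedding moves by exactly 0.1, because its step no longer depends on its own magnitude.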