transformers
5600e6f3 - Hardcode GELU as the intermediate activation for ESM (#22892)

Commit
2 years ago
Hardcode GELU as the intermediate activation for ESM (#22892)

* Hardcode GELU as the intermediate activation for ESM
* Sneak a quick fix to the weight tying in too
* Make the call to gelu explicit
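The diff itself is not reproduced on this page, so the following is only a minimal sketch of what "hardcoding" an activation and making the call explicit can look like. It assumes the change swaps a config-driven activation lookup for a direct GELU call. The names `ACTIVATIONS`, `intermediate_configurable`, and `intermediate_hardcoded` are hypothetical and are not taken from the transformers codebase; plain Python floats stand in for tensors.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Hypothetical registry standing in for a config-driven activation lookup.
ACTIVATIONS = {
    "gelu": gelu,
    "relu": lambda x: max(0.0, x),
}


def intermediate_configurable(hidden, hidden_act="gelu"):
    """Before (assumed): the activation is chosen by a config string."""
    act = ACTIVATIONS[hidden_act]
    return [act(h) for h in hidden]


def intermediate_hardcoded(hidden):
    """After (assumed): GELU is called explicitly, ignoring any config value."""
    return [gelu(h) for h in hidden]
```

With the hardcoded variant, a configuration that names a different activation can no longer change the intermediate layer's behavior, which is the effect the commit title describes.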