transformers
5a8a4eb1 - Improve BERT-like models performance with better self attention (#9124)

Committed 5 years ago
Improve BERT-like models performance with better self attention (#9124)

* Improve BERT-like models attention layers
* Apply style
* Put back error raising instead of assert
* Update template
* Fix copies
* Apply raising ValueError in MPNet
* Restore the copy check for the Intermediate layer in Longformer
* Update Longformer
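One of the listed changes, putting back explicit error raising instead of `assert`, typically takes the shape sketched below. This is an illustrative sketch only, not the exact diff from PR #9124: it shows the pattern of validating a BERT-like self-attention configuration with a raised `ValueError` rather than an assertion.

```python
# Illustrative sketch only -- not the exact code from PR #9124.
# Pattern: replace an `assert` with an explicit ValueError in the
# configuration check of a BERT-like self-attention layer.

class SelfAttentionSketch:
    def __init__(self, config):
        # Before (assertion, silently stripped under `python -O`):
        #     assert config.hidden_size % config.num_attention_heads == 0
        # After: raise a descriptive error that always fires and tells the
        # user which configuration values are incompatible.
        if config.hidden_size % config.num_attention_heads != 0:
            raise ValueError(
                f"The hidden size ({config.hidden_size}) is not a multiple of the "
                f"number of attention heads ({config.num_attention_heads})"
            )
        self.num_attention_heads = config.num_attention_heads
        self.attention_head_size = config.hidden_size // config.num_attention_heads
```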