Improve BERT-like models' performance with better self-attention (#9124)
* Improve BERT-like models' attention layers (see the first sketch after this list)
* Apply style
* Put back error raising instead of asserts (see the second sketch after this list)
* Update template
* Fix copies
* Raise ValueError instead of using assert in MPNet
* Restore the copy check for the Intermediate layer in Longformer
* Update Longformer
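
For context, the sketch below shows the scaled dot-product, multi-head self-attention pattern that BERT-like models use. The class and method names are illustrative assumptions, not the `transformers` implementation touched by this PR; it is only meant to show which layer the changes target.

```python
import math

import torch
import torch.nn as nn


class SimpleSelfAttention(nn.Module):
    """Minimal BERT-style multi-head self-attention (illustrative only)."""

    def __init__(self, hidden_size: int, num_attention_heads: int):
        super().__init__()
        # Divisibility check omitted here; see the next sketch for the
        # raise-instead-of-assert version of that guard.
        self.num_attention_heads = num_attention_heads
        self.attention_head_size = hidden_size // num_attention_heads
        self.query = nn.Linear(hidden_size, hidden_size)
        self.key = nn.Linear(hidden_size, hidden_size)
        self.value = nn.Linear(hidden_size, hidden_size)

    def _split_heads(self, x: torch.Tensor) -> torch.Tensor:
        # (batch, seq_len, hidden) -> (batch, heads, seq_len, head_size)
        batch, seq_len, _ = x.size()
        x = x.view(batch, seq_len, self.num_attention_heads, self.attention_head_size)
        return x.permute(0, 2, 1, 3)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        query = self._split_heads(self.query(hidden_states))
        key = self._split_heads(self.key(hidden_states))
        value = self._split_heads(self.value(hidden_states))

        # Scaled dot-product attention scores.
        scores = torch.matmul(query, key.transpose(-1, -2))
        scores = scores / math.sqrt(self.attention_head_size)
        probs = nn.functional.softmax(scores, dim=-1)

        # Weighted sum of values, then merge the heads back into the hidden dim.
        context = torch.matmul(probs, value).permute(0, 2, 1, 3).contiguous()
        batch, seq_len = context.size(0), context.size(1)
        return context.view(batch, seq_len, -1)


# Example usage: a (batch=2, seq_len=8, hidden=64) input with 4 heads.
layer = SimpleSelfAttention(hidden_size=64, num_attention_heads=4)
output = layer(torch.randn(2, 8, 64))  # -> shape (2, 8, 64)
```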
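
The bullets about error raising refer to replacing `assert` statements with explicit exceptions. The snippet below is a minimal sketch of that pattern with a hypothetical helper name; it is not copied from the MPNet code.

```python
def check_heads_divide_hidden_size(hidden_size: int, num_attention_heads: int) -> None:
    # Before: assert hidden_size % num_attention_heads == 0
    # An explicit raise keeps the check active even under `python -O`,
    # which strips assert statements, and gives users a clearer message.
    if hidden_size % num_attention_heads != 0:
        raise ValueError(
            f"The hidden size ({hidden_size}) is not a multiple of the number "
            f"of attention heads ({num_attention_heads})"
        )


check_heads_divide_hidden_size(768, 12)    # passes silently
# check_heads_divide_hidden_size(768, 10)  # would raise ValueError
```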