transformers
af1a10bf - [Flax] Return Attention from BERT, ELECTRA, RoBERTa and GPT2 (#11918)

Commit

4 years ago

[Flax] Return Attention from BERT, ELECTRA, RoBERTa and GPT2 (#11918)

* Added logic to return attention from flax-bert model and added test cases to check that
* Added new line at the end of file to test_modeling_flax_common.py
* fixing code style
* Fixing RoBERTa and ELECTRA models too by copying from BERT
* Added temporary hack to not run test_attention_outputs for FlaxGPT2
* Returning attention weights from GPT2 and changed the tests accordingly.
* last fixes
* bump flax dependency

Co-authored-by: jayendra <jayendra@infocusp.in>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

References

#11918 - [Flax] Return Attention from BERT, ELECTRA, RoBERTa and GPT2

Author

jayendra13

Parents

e1205e47
