[`GPT-NeoX`] Add SDPA support (#31031)
* start SDPA support in `gptneox` models
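For orientation, a minimal sketch of the core of an SDPA attention path; the function name and tensor shapes below are illustrative assumptions, not the merged `gptneox` code.

```python
# Minimal sketch, assuming standard [batch, heads, seq, head_dim] layout.
import torch
import torch.nn.functional as F


def sdpa_attention(query, key, value, attention_mask=None, dropout_p=0.0, training=False):
    # attention_mask: additive float mask broadcastable to
    # [batch, num_heads, q_len, kv_len], or None for full attention.
    attn_output = F.scaled_dot_product_attention(
        query,
        key,
        value,
        attn_mask=attention_mask,
        dropout_p=dropout_p if training else 0.0,  # dropout only at train time
    )
    # [batch, num_heads, q_len, head_dim] -> [batch, q_len, num_heads * head_dim]
    batch, _, q_len, _ = attn_output.shape
    return attn_output.transpose(1, 2).reshape(batch, q_len, -1)
```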
* add a small comment on tests
* fix dropout
* update documentation and style
* clarify concrete paths for reference
* generalise attn projections and rope application
add head mask check to SDPA mask creation
handle SDPA memory-efficient backend bug via our own version flag
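A hedged sketch of the two guards this commit describes; `sdpa_forward`, `REQUIRE_CONTIGUOUS_QKV`, and the exact version cutoff are assumptions, and the shared projection-and-rope helper is not shown.

```python
# Illustrative only; names and the version cutoff are assumptions.
import torch
import torch.nn.functional as F
from packaging import version

# Older torch releases have a bug in SDPA's memory-efficient backend with
# non-contiguous inputs, so record our own flag once at import time instead
# of re-parsing torch.__version__ on every forward pass.
REQUIRE_CONTIGUOUS_QKV = version.parse(torch.__version__.split("+")[0]) < version.parse("2.2.0")


def sdpa_forward(query, key, value, attention_mask=None, head_mask=None):
    if head_mask is not None:
        # SDPA cannot apply a per-head mask, so the caller must route this
        # case to the eager attention implementation instead.
        raise ValueError("head_mask is not supported by the SDPA path")
    if REQUIRE_CONTIGUOUS_QKV and query.device.type == "cuda" and attention_mask is not None:
        # Work around the memory-efficient backend bug on affected versions.
        query, key, value = query.contiguous(), key.contiguous(), value.contiguous()
    return F.scaled_dot_product_attention(query, key, value, attn_mask=attention_mask)
```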
* update docs and style
* move dtype casting outside of the general attn_projection_and_rope function
fix related flash_attn_2 handling
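A sketch of what moving the cast out of the shared helper could look like, assuming the helper returns query/key upcast for rotary embeddings; the function name here is hypothetical.

```python
import torch


def cast_qk_to_value_dtype(query, key, value):
    # Rotary embeddings may be applied in higher precision (e.g. fp32), so
    # query/key can come out of the shared projection-and-rope helper in a
    # different dtype than value. Each attention backend performs this cast
    # itself, since e.g. flash attention has its own dtype requirements.
    target_dtype = value.dtype
    if query.dtype != target_dtype:
        query = query.to(target_dtype)
    if key.dtype != target_dtype:
        key = key.to(target_dtype)
    return query, key
```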
* emit a more generic attention warning if output_attentions or head_mask is set (sketched below)
* simplify head mask check by moving head mask creation to a later point
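Together, the previous two bullets suggest a fallback shaped roughly like the sketch below; function names and the warning text are assumptions. The head mask is only materialized on the eager path, i.e. at the "later point" the commit mentions.

```python
import logging

import torch
import torch.nn.functional as F

logger = logging.getLogger(__name__)


def attention_forward(query, key, value, attention_mask=None, head_mask=None,
                      output_attentions=False):
    # One generic warning covers both unsupported features, then we fall
    # back to the eager implementation, which can honor them.
    if output_attentions or head_mask is not None:
        logger.warning(
            "SDPA does not support `output_attentions=True` or `head_mask`; "
            "falling back to the eager attention implementation."
        )
        return eager_attention(query, key, value, attention_mask, head_mask, output_attentions)
    out = F.scaled_dot_product_attention(query, key, value, attn_mask=attention_mask)
    return out, None  # SDPA never returns attention weights


def eager_attention(query, key, value, attention_mask=None, head_mask=None,
                    output_attentions=False):
    scores = torch.matmul(query, key.transpose(-1, -2)) / query.shape[-1] ** 0.5
    if attention_mask is not None:
        scores = scores + attention_mask
    weights = scores.softmax(dim=-1)
    if head_mask is not None:
        # The head mask is created/applied only here, on the eager path.
        weights = weights * head_mask
    out = torch.matmul(weights, value)
    return out, (weights if output_attentions else None)
```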
* remove copied llama artifact
* remove padding_mask from attention function signature
* remove unnecessary comments; only "save" the attn implementation once
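The "save once" likely refers to resolving the configured attention backend a single time at construction rather than on every forward; a hedged sketch of that dispatch pattern, with placeholder class and mapping names:

```python
# Hypothetical sketch; class and mapping names are placeholders.
import torch.nn as nn


class EagerAttention(nn.Module):
    """Stand-in for the eager attention class."""


class SdpaAttention(nn.Module):
    """Stand-in for the SDPA attention class."""


ATTENTION_CLASSES = {"eager": EagerAttention, "sdpa": SdpaAttention}


class Layer(nn.Module):
    def __init__(self, config):
        super().__init__()
        # "Save" the configured implementation exactly once, at init time,
        # instead of re-reading config._attn_implementation every forward.
        self.attention = ATTENTION_CLASSES[config._attn_implementation]()
```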
* [run_slow] gpt_neox