transformers
[Flash Attention 2] Add flash attention 2 for GPT-J
#28295
Merged