transformers
Fix GPT-NeoX-20B past handling, attention computation
#17811
Merged
