onnxruntime
330b8e74 - Fix attention parity for GPT-2 (#8549)

Commit
4 years ago
Fix attention parity for GPT-2 (#8549)

* Use persistent softmax for parity with Hugging Face
* Fix unidirectional mask logic
* Add test
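
For context on the unidirectional mask named in the commit message, here is a minimal sketch of a causal masked softmax in plain Python. It is not the code from this commit and does not reproduce onnxruntime's kernels; the function name `causal_softmax` and the list-of-lists layout are assumptions made for illustration only.

```python
import math


def causal_softmax(scores):
    """Row-wise softmax over a square score matrix with a unidirectional mask.

    Hypothetical helper: position i may attend only to positions j <= i,
    and every masked (future) position receives a weight of exactly 0.
    """
    out = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]          # keys at or before the query position
        m = max(visible)                # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        # Normalize the visible entries, then pad the masked tail with zeros.
        out.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return out


weights = causal_softmax([
    [0.0, 5.0, 5.0],
    [1.0, 1.0, 9.0],
    [0.0, 0.0, 0.0],
])
```

In this sketch the large scores above the diagonal have no effect on the result, which is the property a unidirectional mask is meant to guarantee.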