Fix attention parity for GPT-2 #8549
add comments
444dd896
Merge branch 'master' of https://github.com/Microsoft/onnxruntime
4996bc45
Use persistent softmax to parity with huggingface
e15852bb
format
9de56ff3
enable persistent softmax for gpt-2 by default
9507fb58
update test
e18af41d
tianleiwu
marked this pull request as draft 4 years ago
fix unidirectional mask in cpu
b16fce05
tianleiwu
force pushed from c7fdf024 to b16fce05 4 years ago
tianleiwu
changed the title from "Use persistent softmax in attention cuda operator for GPT parity" to "Fix attention parity for GPT-2" 4 years ago
move reshape remover to post-process
5857e243
clean up header
40d6a291
fix windows build
152e083f
tianleiwu
marked this pull request as ready for review 4 years ago
Use persistent softmax to parity with huggingface
538ef199
format
4e9663ea
enable persistent softmax for gpt-2 by default
7843e84d
update test
5aa53eee
fix unidirectional mask in cpu
0ba0c811
clean up header
3dc092da
fix windows build
8e4532c8
clean test
0e39b52d
Merge branch 'tlwu/fix_gpt_attention_cuda_hugginface_parity' of https…
9a0a3c59
wangyems
approved these changes on 2021-07-30
tianleiwu
merged 330b8e74 into master 4 years ago
tianleiwu
deleted the tlwu/fix_gpt_attention_cuda_hugginface_parity branch 4 years ago