PR #8549 Fix attention parity for GPT-2

tianleiwu merged 19 commits into master from tlwu/fix_gpt_attention_cuda_hugginface_parity

add comments

444dd896

Merge branch 'master' of https://github.com/Microsoft/onnxruntime

4996bc45

Use persistent softmax to parity with huggingface

e15852bb

format

9de56ff3

enable persistent softmax for gpt-2 by default

9507fb58

update test

e18af41d

tianleiwu requested a review 4 years ago

tianleiwu marked this pull request as draft 4 years ago

fix unidirectional mask in cpu

b16fce05

tianleiwu force pushed from c7fdf024 to b16fce05 4 years ago

tianleiwu changed the title ~~Use persistent softmax in attention cuda operator for GPT parity~~ Fix attention parity for GPT-2 4 years ago

move reshape remover to post-process

5857e243

clean up header

40d6a291

fix windows build

152e083f

tianleiwu marked this pull request as ready for review 4 years ago

tianleiwu requested a review from yufenglee 4 years ago

tianleiwu requested a review from gh-yewang 4 years ago

Use persistent softmax to parity with huggingface

538ef199

format

4e9663ea

enable persistent softmax for gpt-2 by default

7843e84d

update test

5aa53eee

fix unidirectional mask in cpu

0ba0c811

clean up header

3dc092da

fix windows build

8e4532c8

clean test

0e39b52d

Merge branch 'tlwu/fix_gpt_attention_cuda_hugginface_parity' of https…

9a0a3c59

gh-yewang commented on 2021-07-30

gh-yewang approved these changes on 2021-07-30

tianleiwu merged 330b8e74 into master 4 years ago

tianleiwu deleted the tlwu/fix_gpt_attention_cuda_hugginface_parity branch 4 years ago

Reviewers

gh-yewang

yufenglee
