onnxruntime
Fuse attention node even in case of different Q,K hidden dimensions
#8106

Merged

viboga merged 25 commits into master from Vish/opt_attn_qkv_update

changes to fuse attention node and create varied dimensions

75583f73

added an option to optimizer to only do offline fusion

0f83a68b

fixing a typo

59294d3a

merge with master

e98eaf03

Merge remote-tracking branch 'origin/master' into Vish/optimizer_attn…

5c009f19

viboga requested a review 4 years ago

removing extra changes

79f1dea2

viboga marked this pull request as draft 4 years ago

viboga changed the title ~~Vish/opt attn qkv update~~ Fuse attention node even in case of different Q,K hidden dimensions 4 years ago

added new unit test - test_attention_fusion_for_varied_qkv_dimensions()

9d77b9c8

Unit test successful for q,k,v paths with varied dimensions

02a4c482

adding test model for unit test case

370b6120

optimizing attention tests

e4b6b23e

removing debugs

5cec7c25

viboga assigned wangyems 4 years ago

viboga assigned tianleiwu 4 years ago

viboga marked this pull request as ready for review 4 years ago

wangyems commented on 2021-06-21

minor change

4e52c0db

wangyems requested a review from wangyems 4 years ago

wangyems dismissed these changes on 2021-06-21

addressing comments

9f5159bb

viboga dismissed their stale review via 9f5159bb 4 years ago

tianleiwu commented on 2021-06-22

addressing comments

4de23c01

changed the new option to disable_onnxruntime

2d89fe63

tianleiwu commented on 2021-06-22

replacing asserts with debugs

48c9bc9a

make attn fusion backward compatible for head_size, hidden_size

b7145ece

preserving behavior for shape_modified_tensor

68a8dd16

adding new option as the last parameter

f76c039f

cleaning up

63752239

line breaks and spaces

b1ad048d

formatting according to python

b4c5ed3e

viboga requested a review from tianleiwu 4 years ago

viboga requested a review from wangyems 4 years ago

making the changes to fuse attention node without user input

5221a743

changes to fusion_attention.py updated

34deae75

bringing the code up to python standard

8e924ced

tianleiwu approved these changes on 2021-06-24

viboga merged b478086b into master 4 years ago

viboga deleted the Vish/opt_attn_qkv_update branch 4 years ago

Reviewers

tianleiwu

wangyems

Assignees

tianleiwu

wangyems

Labels

None yet

Milestone

No milestone
