onnxruntime
[CUDA] Add PackedMultiHeadAttention operator
#16779
Merged

tianleiwu merged 26 commits into main from tlwu/packed_mha
tianleiwu add packed MHA op
d3d0b43a
tianleiwu marked this pull request as draft 2 years ago
tianleiwu fix err message in MHA shape inference
e0467dc6
tianleiwu refactor packed attention
80537acf
tianleiwu register op
de290427
tianleiwu clean unused code in packed attention
3f4e7a0f
tianleiwu remove qkv_hidden_sizes_ from base
1787acbb
github-advanced-security commented on 2023-07-20
tianleiwu format
6e393948
tianleiwu expose LaunchTransposeRemovePadding
c1161901
tianleiwu draft kernel
f3a686cf
tianleiwu Add unit test
31681512
tianleiwu fix build
6101f32f
tianleiwu Add test case
694577ab
tianleiwu fix debug code
e4daa1b4
tianleiwu fix typo
63a4395c
tianleiwu test trt, cutlass and unfused separately
e923ff74
github-advanced-security commented on 2023-07-26
tianleiwu Merge branch 'main' into tlwu/packed_mha
0a6c54d3
tianleiwu instantiation TrtFusedAttention
689299c8
tianleiwu update doc
736975b4
tianleiwu format
e49e004f
tianleiwu exclude from hipify
f924a187
tianleiwu add more test cases
cdc73698
tianleiwu Merge branch 'main' into tlwu/packed_mha
813aa06d
tianleiwu undo test_data_gen script
5a964887
tianleiwu requested a review from yufenglee 2 years ago
tianleiwu requested a review from gh-yewang 2 years ago
tianleiwu marked this pull request as ready for review 2 years ago
tianleiwu test cutlass broadcast relative positional bias
43595e58
tianleiwu Merge branch 'tlwu/packed_mha' of https://github.com/microsoft/onnxru…
f7ee1490
tianleiwu Merge branch 'main' into tlwu/packed_mha
c645cd84
gh-yewang commented on 2023-07-28
gh-yewang commented on 2023-07-28
gh-yewang commented on 2023-07-28
gh-yewang approved these changes on 2023-07-28
tianleiwu merged 742edec5 into main 2 years ago
tianleiwu deleted the tlwu/packed_mha branch 2 years ago
