onnxruntime
Add GQA support for ROCm
#21032
Merged


cloudhan merged 18 commits into main from guangyunhan/rocm-gqa
cloudhan feat: init rocm gqa
c877adcb
cloudhan feat: extend strided copy to support runtime tok idx
f845099d
cloudhan more case
0ea33352
cloudhan feat: local
99b2feb6
cloudhan feat: rotary
816249c8
cloudhan feat: allow rotary to read and write in a strided way, so that we don…
6024dc96
cloudhan fix: rotary for packed qkv
48092eee
cloudhan remove debug print
de2f30ae
github-advanced-security commented on 2024-06-13
cloudhan force pushed from d33e0b74 to 19970a75 1 year ago
cloudhan force pushed from 19970a75 to 23e20bc6 1 year ago
cloudhan force pushed from 23e20bc6 to 6ccd1d7c 1 year ago
cloudhan force pushed from 6ccd1d7c to b6be9bde 1 year ago
cloudhan force pushed from b6be9bde to 14d1a1ab 1 year ago
cloudhan workaround: add flash_attn test to ci
6c4e6125
cloudhan add gpu arch checking warning log
e9f6d13f
cloudhan fix: build without ck tile
2b0c46ed
cloudhan force pushed from 14d1a1ab to 2b0c46ed 1 year ago
cloudhan test: update ci pytorch and triton version to fix tests which have fa…
6091a69b
cloudhan format
8ca06340
cloudhan remove unused param is_input_bnsh_format from strided version LaunchR…
e22dfb9e
cloudhan marked this pull request as ready for review 1 year ago
cloudhan requested a review 1 year ago
cloudhan requested a review from tianleiwu 1 year ago
cloudhan requested a review from yufenglee 1 year ago
cloudhan make onnxruntime_USE_COMPOSABLE_KERNEL_CK_TILE depends on onnxruntime…
789fee73
cloudhan skip test_flash_attn_rocm on CUDA platform
b973217f
github-advanced-security commented on 2024-07-01
cloudhan fix lint and ci
c3c7089d
cloudhan fix typo
f4355d46
cloudhan force pushed from 1384ff45 to f4355d46 1 year ago
tianleiwu approved these changes on 2024-07-02
mszhanyi approved these changes on 2024-07-03
cloudhan merged f39ee14b into main 1 year ago
cloudhan deleted the guangyunhan/rocm-gqa branch 1 year ago
