onnxruntime
Add Continuous Decoding support in GQA
#21523
Merged

aciddelgado merged 28 commits into main from aciddelgado/gqa_interactive
aciddelgado
aciddelgado gqa supports interactive
ee47ba45
aciddelgado requested a review from tianleiwu 1 year ago
aciddelgado requested a review from yufenglee 1 year ago
github-advanced-security
github-advanced-security commented on 2024-07-26
github-advanced-security
github-advanced-security commented on 2024-07-26
aciddelgado lint, clang, clean-up manually
d938816d
github-advanced-security
github-advanced-security commented on 2024-07-26
aciddelgado Merge branch 'main' into aciddelgado/gqa_interactive
01cd33b5
aciddelgado new idea, seqlense_q
fdc84b4d
aciddelgado cpu update
1cddf5f3
aciddelgado cpu almost works but segfaults on non-interactive prompt but we gotta…
8e3483e5
github-advanced-security
github-advanced-security commented on 2024-07-31
aciddelgado single batch implementation unclean
60fe746d
github-advanced-security
github-advanced-security commented on 2024-08-06
aciddelgado clean up code
dd3c4a66
aciddelgado clang lint
3565dc2f
aciddelgado changes
bd83af79
aciddelgado trigger pipelines and whatnot
aaa98665
aciddelgado merge main
d4e72f85
aciddelgado pipeline
d8f37c00
aciddelgado pipelines
548ab9be
aciddelgado merge main
83588191
aciddelgado minor rotary test change
5ff6050c
aciddelgado pls
255fa1c6
yufenglee
yufenglee commented on 2024-09-09
yufenglee
yufenglee commented on 2024-09-09
aciddelgado fixes
11c4a0e3
aciddelgado marked this pull request as ready for review 1 year ago
yufenglee
yufenglee commented on 2024-09-09
yufenglee
yufenglee commented on 2024-09-09
yufenglee
yufenglee commented on 2024-09-10
tianleiwu
tianleiwu commented on 2024-09-10
yufenglee
yufenglee commented on 2024-09-10
yufenglee
yufenglee commented on 2024-09-10
yufenglee
yufenglee commented on 2024-09-10
yufenglee
yufenglee commented on 2024-09-10
yufenglee
yufenglee commented on 2024-09-10
yufenglee
yufenglee commented on 2024-09-10
aciddelgado docs
4a86c55e
aciddelgado changed the title from "Add Interactive Decoding support in GQA" to "Add Continuous Decoding support in GQA" 1 year ago
aciddelgado address comments
0be4962b
aciddelgado docs
c0cb4c5e
yufenglee
yufenglee commented on 2024-09-11
yufenglee
yufenglee dismissed these changes on 2024-09-11
tianleiwu
tianleiwu commented on 2024-09-11
tianleiwu
tianleiwu commented on 2024-09-11
tianleiwu
tianleiwu commented on 2024-09-11
tianleiwu
tianleiwu commented on 2024-09-11
tianleiwu
aciddelgado comments
019b058a
aciddelgado dismissed their stale review via 019b058a 1 year ago
aciddelgado remove cuda helper
e46ac2d7
tianleiwu
aciddelgado rocm
a9e4c768
tianleiwu
aciddelgado lint
d92e0f40
aciddelgado docs
f089af33
tianleiwu
tianleiwu commented on 2024-09-12
aciddelgado description
5e8d419b
tianleiwu
tianleiwu commented on 2024-09-13
aciddelgado docs
36ca4d14
tianleiwu
tianleiwu approved these changes on 2024-09-13
aciddelgado merged 7e2c7224 into main 1 year ago
aciddelgado deleted the aciddelgado/gqa_interactive branch 1 year ago

