onnxruntime
GQA Memory Efficient Kernel
#17920
Merged

tianleiwu merged 20 commits into main from aciddelgado/gqa_memeff_v2
aciddelgado squash merge
ef82d4d3
aciddelgado requested a review from tianleiwu 2 years ago
aciddelgado requested a review from yufenglee 2 years ago
github-advanced-security commented on 2023-10-12
aciddelgado changed the title from "squash merge" to "GQA Memory Efficient Kernel" 2 years ago
tianleiwu commented on 2023-10-12
tianleiwu commented on 2023-10-12
aciddelgado add unit test and fix build
54f25264
github-advanced-security commented on 2023-10-12
aciddelgado undo work in attention_impl file
e2e1157e
tianleiwu commented on 2023-10-18
tianleiwu commented on 2023-10-18
tianleiwu commented on 2023-10-18
tianleiwu commented on 2023-10-18
tianleiwu commented on 2023-10-18
tianleiwu commented on 2023-10-18
aciddelgado reduce tests and change default behavior when past-kv is nullptr
415440b8
github-advanced-security commented on 2023-10-19
aciddelgado test compatibility w/ no cuda
0e84d2df
aciddelgado exclude from amd
9ba69638
aciddelgado fix test script
65731333
yufenglee commented on 2023-10-20
yufenglee commented on 2023-10-20
yufenglee commented on 2023-10-20
yufenglee commented on 2023-10-21
yufenglee commented on 2023-10-21
aciddelgado added release:1.16.2
aciddelgado make kernels more efficient and make present output required
a87f211d
aciddelgado Merge branch 'main' into aciddelgado/gqa_memeff_v2
16bda285
tianleiwu commented on 2023-10-24
tianleiwu commented on 2023-10-24
tianleiwu commented on 2023-10-24
tianleiwu commented on 2023-10-24
tianleiwu commented on 2023-10-24
aciddelgado merge main and memeff changes
08f553df
aciddelgado address comments
eb125226
aciddelgado update ContribOperators.md
6c6aead9
faxu added triage:approved
faxu added sdxl_llama
yufenglee commented on 2023-10-26
yufenglee commented on 2023-10-26
yufenglee commented on 2023-10-26
yufenglee commented on 2023-10-27
aciddelgado clarify input and output formats memory efficient attention
db307f36
aciddelgado max sequence length for memory efficient attention
e7a50ee9
aciddelgado clang and fix test file
bc1cf0aa
tianleiwu commented on 2023-10-27
tianleiwu commented on 2023-10-27
aciddelgado undo clang on unrelated files
44ca857f
aciddelgado Merge branch 'main' into aciddelgado/gqa_memeff_v2
f08495f7
aciddelgado check value and key inputs
660c8fd5
tianleiwu commented on 2023-10-31
tianleiwu commented on 2023-10-31
tianleiwu commented on 2023-10-31
aciddelgado key and value dont check for nullptr since they are required
b9c4d15d
tianleiwu dismissed these changes on 2023-11-01
aciddelgado lint
fd0ecc3b
aciddelgado dismissed their stale review via fd0ecc3b 2 years ago
tianleiwu approved these changes on 2023-11-01
tianleiwu merged 178f7caa into main 2 years ago
tianleiwu deleted the aciddelgado/gqa_memeff_v2 branch 2 years ago
tianleiwu removed triage:approved
tianleiwu removed release:1.16.2
tianleiwu removed sdxl_llama