GQA Memory Efficient Kernel #17920
squash merge
ef82d4d3
aciddelgado
changed the title from "squash merge" to "GQA Memory Efficient Kernel" 2 years ago
add unit test and fix build
54f25264
undo work in attention_impl file
e2e1157e
reduce tests and change default behavior for past-kv is nullptr
415440b8
test compatibility w/ no cuda
0e84d2df
exclude from amd
9ba69638
fix test script
65731333
make kernels more efficient and make present output required
a87f211d
Merge branch 'main' into aciddelgado/gqa_memeff_v2
16bda285
merge main and memeff changes
08f553df
address comments
eb125226
update ContribOperators.md
6c6aead9
faxu
added triage:approved
faxu
added sdxl_llama
clarify input and output formats memory efficient attention
db307f36
max sequence length for memory efficient attention
e7a50ee9
clang and fix test file
bc1cf0aa
undo clang on unrelated files
44ca857f
Merge branch 'main' into aciddelgado/gqa_memeff_v2
f08495f7
check value and key inputs
660c8fd5
key and value dont check for nullptr since they are required
b9c4d15d
tianleiwu
dismissed these changes
on 2023-11-01
lint
fd0ecc3b
aciddelgado
dismissed their stale review
via fd0ecc3b
2 years ago
tianleiwu
approved these changes
on 2023-11-01
tianleiwu
merged
178f7caa
into main 2 years ago
tianleiwu
deleted the aciddelgado/gqa_memeff_v2 branch 2 years ago
Assignees
No one assigned