onnxruntime
GQA Flash Attention with Attention Mask
#18283
Merged

aciddelgado merged 51 commits into main from aciddelgado/gqa_seqlens_k
aciddelgado squash merge
ef82d4d3
aciddelgado add unit test and fix build
54f25264
aciddelgado undo work in attention_impl file
e2e1157e
aciddelgado reduce tests and change default behavior for past-kv is nullptr
415440b8
aciddelgado test compatibility w/ no cuda
0e84d2df
aciddelgado exclude from amd
9ba69638
aciddelgado fix test script
65731333
aciddelgado make kernels more efficient and make present output required
a87f211d
aciddelgado Merge branch 'main' into aciddelgado/gqa_memeff_v2
16bda285
aciddelgado merge main and memeff changes
08f553df
aciddelgado address comments
eb125226
aciddelgado update ContribOperators.md
6c6aead9
aciddelgado clarify input and output formats memory efficient attention
db307f36
aciddelgado max sequence length for memory efficient attention
e7a50ee9
aciddelgado clang and fix test file
bc1cf0aa
aciddelgado undo clang on unrelated files
44ca857f
aciddelgado Merge branch 'main' into aciddelgado/gqa_memeff_v2
f08495f7
aciddelgado check value and key inputs
660c8fd5
aciddelgado key and value dont check for nullptr since they are required
b9c4d15d
aciddelgado lint
fd0ecc3b
aciddelgado start work on support for right pad gqa
59b8aa59
aciddelgado attention mask for flash attention with cache
afa8ea01
aciddelgado flash attention works no buffer
4c5a32ab
aciddelgado disable memory efficient and pipeline test
90f23c0d
aciddelgado requested a review from tianleiwu 2 years ago
aciddelgado requested a review from yufenglee 2 years ago
tianleiwu commented on 2023-11-04
tianleiwu commented on 2023-11-04
tianleiwu commented on 2023-11-04
aciddelgado merge main
40f6e3b5
github-advanced-security commented on 2023-11-04
aciddelgado fix warning and lint
e2eadab7
tianleiwu commented on 2023-11-04
tianleiwu commented on 2023-11-04
aciddelgado build warnings
d89995f0
yufenglee commented on 2023-11-04
yufenglee commented on 2023-11-04
yufenglee commented on 2023-11-05
yufenglee commented on 2023-11-05
yufenglee commented on 2023-11-05
yufenglee commented on 2023-11-05
aciddelgado remove kv share flag and bnsh flag
9e2fae76
aciddelgado address comments
2fab38e7
aciddelgado docs
791621b0
aciddelgado undo cmake/external/onnx change
66f1600d
aciddelgado eigen update
6d754f68
aciddelgado deps update
b5b62cdc
aciddelgado requested a review 2 years ago
yufenglee commented on 2023-11-05
yufenglee commented on 2023-11-05
yufenglee commented on 2023-11-05
yufenglee commented on 2023-11-05
aciddelgado change seqlen logic
bcd126ae
yufenglee commented on 2023-11-05
yufenglee commented on 2023-11-05
yufenglee commented on 2023-11-05
yufenglee commented on 2023-11-05
aciddelgado comments and docs
96b02705
yufenglee commented on 2023-11-05
yufenglee fix seqlen_k for prompt
11777fe0
aciddelgado test cases
070a2f0e
tianleiwu commented on 2023-11-06
tianleiwu added release:1.16.2
tianleiwu added sdxl_llama
yufenglee add memory efficient support
0d8786a1
aciddelgado fix lp
b1ed3efa
aciddelgado Merge branch 'aciddelgado/gqa_seqlens_k_lp_make_it_right' into acidde…
20dd3303
aciddelgado left padding thing
4d2f37e0
aciddelgado Merge branch 'main' into aciddelgado/gqa_seqlens_k
4137c055
aciddelgado lint
70de4f4b
yufenglee commented on 2023-11-07
yufenglee commented on 2023-11-07
yufenglee commented on 2023-11-07
yufenglee commented on 2023-11-07
aciddelgado fix left padding and make seqlens clearer
e9672f82
aciddelgado fix left padding indexing issue
7afaa455
yufenglee fix build break on windows
d9e2a8fd
yufenglee Merge branch 'aciddelgado/gqa_seqlens_k' of github.com:microsoft/onnx…
8a7c57fb
yufenglee commented on 2023-11-07
aciddelgado nit and documentation
4801fe82
aciddelgado remove padding
1e1a25c3
aciddelgado docs
4b3bec16
aciddelgado lint
4a968818
yufenglee approved these changes on 2023-11-07
aciddelgado merged 3dece27f into main 2 years ago
aciddelgado deleted the aciddelgado/gqa_seqlens_k branch 2 years ago
tianleiwu removed release:1.16.2
tianleiwu removed sdxl_llama
