GQA Rotary and Packed QKV with Flash #18906
squash merge
ef82d4d3
add unit test and fix build
54f25264
undo work in attention_impl file
e2e1157e
reduce tests and change default behavior for past-kv is nullptr
415440b8
test compatibility w/ no cuda
0e84d2df
exclude from amd
9ba69638
fix test script
65731333
work on local attention flash
bd79b6d7
vscode idk
fb8e386d
make kernels more efficient and make present output required
a87f211d
Merge branch 'main' into aciddelgado/gqa_memeff_v2
16bda285
merge main and memeff changes
08f553df
address comments
eb125226
update ContribOperators.md
6c6aead9
merge main
6e540e33
Merge branch 'aciddelgado/gqa_memeff_v2' into aciddelgado/gqa_local
b3a9d0f3
local working with flash not memeff
3d7f3bf6
clarify input and output formats memory efficient attention
db307f36
max sequence length for memory efficient attention
e7a50ee9
clang and fix test file
bc1cf0aa
undo clang on unrelated files
44ca857f
Merge branch 'main' into aciddelgado/gqa_memeff_v2
f08495f7
check value and key inputs
660c8fd5
key and value dont check for nullptr since they are required
b9c4d15d
fix up test script
afe22a43
fix packedmha, clean test, merge gqa_memeff_v2 branch changes
01256785
Merge branch 'main' into aciddelgado/gqa_local
353c4f59
local working with recent changes
94b5efbe
no local w memeff
ddb7a66b
undo unnecessary changes
d33c69b6
undo change symbolic_shape_infer.py
26ca6d5e
fix pipeline
163a81ec
docs
4326c8db
make prompt to use the kv new
b0a1006c
clean up
23c22a33
update documentation
e97a6fda
Merge branch 'main' into aciddelgado/gqa_local
93cb019d
start work rotary
3c332f06
Merge branch 'main' into aciddelgado/gqa_rotary
b26b4cb8
rotary work
082f347d
Merge branch 'main' into aciddelgado/gqa_rotary
ae34d0cf
Merge branch 'yufeng/gqa_opt' into aciddelgado/gqa_rotary
791bbc3d
rotary fully implemented
6e6ad2cb
packed working
f67316d7
aciddelgado
changed the title from "Aciddelgado/gqa rotary packed" to "GQA Rotary and Packed QKV with Flash" 2 years ago
run formatters
80845859
enable gpu linux pipeline transformers tests
94afb76e
docs and pipeline test
40cfc268
Merge branch 'main' into aciddelgado/gqa_rotary_packed
a821588a
test
b4732832
run pipeline
152d920e
retrigger checks
f271a74e
conflict and requirements change
ba13a3f7
disable transformers test
723637ed
add todo and format
87533ef8
fix lint issue
615d500c
merge conflict
dba1e7e1
tianleiwu
dismissed these changes
on 2024-01-22
address comments
e7863b30
aciddelgado
dismissed their stale review
via e7863b30
2 years ago
tianleiwu
dismissed these changes
on 2024-01-22
lintrunner
5b554246
aciddelgado
dismissed their stale review
via 5b554246
2 years ago
tianleiwu
approved these changes
on 2024-01-23
aciddelgado
deleted the aciddelgado/gqa_rotary_packed branch 2 years ago
snnn
removed release:1.17.0