vllm
[Attention] Flash MLA for V1
#13867
Merged
mgoin merged 42 commits into vllm-project:main from neuralmagic:lwilkinson/flash-mla-v1
[Attention] MLA support for V1
998803e7
torch library bindings, unit tests running
12a5221e
comments
38076021
working in eager mode
955cead8
format
1d5c8680
cuda-graphs still broken but closer i think
eae47876
better comments
c79927d1
remove extra files
37c4f9e6
add attribution
084b031b
fix cuda graphs
07a9bad5
cleaner build fallbacks
4dc8c352
ok cuda-graphs actually fixed now I think
a6c36cc2
format
3ae4a6ef
fix deepseek-v2
a6213a48
Merge branch 'lwilkinson/fix-deepseek-v2' into lwilkinson/flashmla-in…
68895a20
clean up
5e7cd970
Merge remote-tracking branch 'origin/main' into lwilkinson/flashmla-i…
8bb3bdc9
review comment
d4399691
fix mypy
aa42226e
review comments
d18261cf
cleanup
4c08a0a8
fix bad logic
07332bfe
review comments
c4434d9e
update to latest flashMLA which supports fp16
f570fe0e
update to use fork
0bbcf279
remove unnessary include
177ee292
add fp16 source
642456fe
missing symbol
2fa62a9d
[Attention] MLA support for V1
4b7ef4d0
Merge remote-tracking branch 'yang/mla-v1' into lwilkinson/flash-mla-v1
0ae026a5
address review feedback
23c780ff
restore to use attn_module.head_size
29c06c7b
wip v1 FlashMLA
5f8526b6
Merge remote-tracking branch 'yang/mla-v1' into lwilkinson/flash-mla-v1
f9551648
mergify added ci/build
mergify added v1
mergify added needs-rebase
[Attention] MLA support for V1
04c8db41
address review feedback
a456e058
restore to use attn_module.head_size
867d2ede
included more fixes from Lucas
8715cfbe
addressed feedback from Woosuk Kwon
6bf7bfbc
LucasWilkinson force pushed from f9551648 to c63464de 288 days ago
mergify removed needs-rebase
Merge remote-tracking branch 'yang/mla-v1' into lwilkinson/flash-mla-v1
dab8ad6d
LucasWilkinson force pushed from c63464de to dab8ad6d 288 days ago
Merge remote-tracking branch 'origin/main' into lwilkinson/flash-mla-v1
67b2b628
cleanup
e6e57899
LucasWilkinson marked this pull request as ready for review 287 days ago
LucasWilkinson requested a review from WoosukKwon 287 days ago
LucasWilkinson requested a review from robertgshaw2-redhat 287 days ago
LucasWilkinson requested a review from njhill 287 days ago
LucasWilkinson requested a review from ywang96 287 days ago
LucasWilkinson requested a review from comaniac 287 days ago
LucasWilkinson requested a review from alexm-redhat 287 days ago
mgoin added ready
mgoin approved these changes on 2025-02-27
mgoin enabled auto-merge (squash) 287 days ago
mgoin merged 2e94b9cf into main 287 days ago
Reviewers: mgoin, WoosukKwon, robertgshaw2-redhat, njhill, ywang96, comaniac, alexm-redhat
Assignees: No one assigned
Labels: ready, ci/build, v1
Milestone: No milestone