vllm
[Attention] Add FlashInfer Sparse MLA backend
#33451
Merged
LucasWilkinson
merged 39 commits into
vllm-project:main
from
MatthewBonanni:fi_sparse
Initial implementation
fbd9b57c
Add implementation
dee20921
Update test
aaca43b5
Remove unnecessary skip
6f3bbc9a
mergify
added
documentation
mergify
added
nvidia
mergify
added
v1
gemini-code-assist
commented on 2026-01-30
Cleanup
77d5461b
Merge branch 'main' into fi_sparse
a6072d8b
Address refactor
38e6fc99
Move kernel to sparse_utils
ead72ab6
Fix from refactor
49cd2094
Super init call not necessary
14044783
More fixes
349da878
Fix uniform query lengths
d4536e27
Update check_and_update_config
98530a27
Update block size support
bb4e94e5
Parameterize block sizes
ad8d06ed
Update benchmark
f604adb0
mergify
added
performance
Clean up
f281b288
Fix
9a6fb87c
Use single batch layout
5dcf0587
Remove unnecessary abstraction
9bdfa661
Fix indexing
9043242e
Improve test
1fb6fb01
mergify
added
needs-rebase
fix up attention benchmarks
c35a9c6c
clean
24a54002
add smoke
e8e0f4e8
Fix and update test
1daa3d0a
MatthewBonanni
marked this pull request as ready for review
106 days ago
MatthewBonanni
requested a review
from
pavanimajety
106 days ago
MatthewBonanni
requested a review
from
LucasWilkinson
106 days ago
Merge branch 'main' into fi_sparse
23780f19
mergify
removed
needs-rebase
Merge branch 'lwilkinson/fix-up-attention-benchmarks' into fi_sparse
f245674a
mergify
added
ci/build
Fix benchmarks
f326062c
More benchmark ux improvements
553b1401
Update mla_decode
1e29943f
Sort benchmark output
c19e1131
Add mla prefill case
5cb64d62
Prefer FlashInfer at low head counts
9a31d73f
MatthewBonanni
requested a review
from
hmellor
101 days ago
mergify
added
needs-rebase
Merge branch 'main' into fi_sparse
ab3fb88b
mergify
removed
needs-rebase
Update other platforms
6d7b2c32
MatthewBonanni
requested a review
from
tjtanaa
101 days ago
MatthewBonanni
requested a review
from
jikunshang
101 days ago
MatthewBonanni
requested a review
from
bigPYJ1151
101 days ago
mergify
added
rocm
mergify
added
cpu
Merge branch 'main' into fi_sparse
fa0655d3
LucasWilkinson
changed the title from
[WIP][Attention] Add FlashInfer Sparse MLA backend
to
[Attention] Add FlashInfer Sparse MLA backend
101 days ago
mergify
added
needs-rebase
Merge branch 'main' into fi_sparse
db66e7bc
mergify
removed
needs-rebase
Merge branch 'main' into fi_sparse
c69830b2
LucasWilkinson
approved these changes on 2026-02-11
LucasWilkinson
enabled auto-merge (squash)
99 days ago
github-actions
added
ready
LucasWilkinson
merged
f2c47886
into main
98 days ago
Reviewers
LucasWilkinson
gemini-code-assist
pavanimajety
hmellor
tjtanaa
jikunshang
bigPYJ1151
Assignees
No one assigned
Labels
documentation
performance
rocm
ready
ci/build
v1
cpu
nvidia
Milestone
No milestone