vllm
[Update] Use FlashInfer fast_decode_plan directly instead of replication
#34687
Merged
pavanimajety merged 9 commits into vllm-project:main from askliar:main
askliar requested a review from mgoin 87 days ago
askliar requested a review from pavanimajety 87 days ago
mergify added the nvidia label
Refactor FlashInfer metadata handling and decoding parameters in flas… (c654c373)
mergify added the v1 label
askliar force pushed to c654c373 87 days ago
askliar changed the title from "[Update] Use FlashInfer fast_decode_plan directly instead of replication#32182" to "[Update] Use FlashInfer fast_decode_plan directly instead of replication" 87 days ago
gemini-code-assist commented on 2026-02-17
Add a blank line for improved readability in flashinfer.py (4e749384)
Merge branch 'main' of https://github.com/vllm-project/vllm (65247d83)
Add tests for fast_plan_decode functionality in flashinfer (4749656b)
askliar requested a review from tlrmchlsmth 85 days ago
askliar requested a review from WoosukKwon 85 days ago
askliar requested a review from yewentao256 85 days ago
mgoin added the ready label
mgoin requested a review from LucasWilkinson 79 days ago
Enhance fast_plan_decode in flashinfer to support tensor-core specifi… (16583f52)
Refactor fast_plan_decode and update tests for non-tensor-core support (f713140f)
Merge branch 'main' of https://github.com/vllm-project/vllm (62eca944)
Remove deprecated test for non-tensor core GQA in FlashInfer, simplif… (49a2bca4)
pavanimajety approved these changes on 2026-02-26
Merge branch 'main' into main (8e0d95de)
pavanimajety merged 56a63717 into main 77 days ago
Reviewers
pavanimajety
gemini-code-assist
mgoin
tlrmchlsmth
WoosukKwon
yewentao256
LucasWilkinson
Assignees
No one assigned
Labels
ready
v1
nvidia
Milestone
No milestone