vllm
[Update] Use FlashInfer fast_decode_plan directly instead of replication
#34687
Merged

pavanimajety merged 9 commits into vllm-project:main from askliar:main
askliar requested a review from mgoin 87 days ago
askliar requested a review from pavanimajety 87 days ago
mergify added the nvidia label
Refactor FlashInfer metadata handling and decoding parameters in flas…
c654c373
mergify added the v1 label
askliar force-pushed to c654c373 87 days ago
askliar changed the title from "[Update] Use FlashInfer fast_decode_plan directly instead of replication#32182" to "[Update] Use FlashInfer fast_decode_plan directly instead of replication" 87 days ago
gemini-code-assist commented on 2026-02-17
Add a blank line for improved readability in flashinfer.py
4e749384
pavanimajety
Merge branch 'main' of https://github.com/vllm-project/vllm
65247d83
Add tests for fast_plan_decode functionality in flashinfer
4749656b
askliar requested a review from tlrmchlsmth 85 days ago
askliar requested a review from WoosukKwon 85 days ago
askliar requested a review from yewentao256 85 days ago
mgoin added the ready label
mgoin requested a review from LucasWilkinson 79 days ago
Enhance fast_plan_decode in flashinfer to support tensor-core specifi…
16583f52
Refactor fast_plan_decode and update tests for non-tensor-core support
f713140f
pavanimajety
Merge branch 'main' of https://github.com/vllm-project/vllm
62eca944
Remove deprecated test for non-tensor core GQA in FlashInfer, simplif…
49a2bca4
pavanimajety approved these changes on 2026-02-26
askliar Merge branch 'main' into main
8e0d95de
pavanimajety merged 56a63717 into main 77 days ago
