vllm
[NVIDIA] Update FlashInfer to version 0.6.7.post3. Avoid re-downloading BMM export headers when flashinfer-cubin is installed
#38913
Open

johnnynunez wants to merge 10 commits into vllm-project:main from johnnynunez:main
johnnynunez
johnnynunez force pushed from 6ea66ba0 to db0d3fd2 3 days ago
mergify added ci/build
mergify added nvidia
johnnynunez changed the title from "Update FlashInfer to version 0.6.7.post1 in Dockerfiles and related f…" to "[NVIDIA] Update FlashInfer to version 0.6.7.post1" 3 days ago
johnnynunez
gemini-code-assist commented on 2026-04-03
johnnynunez changed the title from "[NVIDIA] Update FlashInfer to version 0.6.7.post1" to "[NVIDIA] Update FlashInfer to version 0.6.7.post1. Hot fix for DGX Spark" 3 days ago
mgoin approved these changes on 2026-04-03
mgoin added ready
mgoin added ready-run-all-tests
johnnynunez changed the title from "[NVIDIA] Update FlashInfer to version 0.6.7.post1. Hot fix for DGX Spark" to "[NVIDIA] Update FlashInfer to version 0.6.7.post1. Avoid re-downloading BMM export headers when flashinfer-cubin is installed" 3 days ago
johnnynunez Update FlashInfer to version 0.6.7.post1 in Dockerfiles and related f…
26bbbaa1
johnnynunez Remove pre-download step for FlashInfer TRTLLM BMM headers in Dockerfile
0e7b5ed8
johnnynunez force pushed from 1acabfaa to 0e7b5ed8 3 days ago
johnnynunez Merge branch 'main' into main
0a459b3f
johnnynunez Merge branch 'vllm-project:main' into main
88f7c9be
cjackal
johnnynunez 0.6.7.post2
e6a85912
johnnynunez Merge branch 'vllm-project:main' into main
12dcd479
johnnynunez changed the title from "[NVIDIA] Update FlashInfer to version 0.6.7.post1. Avoid re-downloading BMM export headers when flashinfer-cubin is installed" to "[NVIDIA] Update FlashInfer to version 0.6.7.post2. Avoid re-downloading BMM export headers when flashinfer-cubin is installed" 2 days ago
johnnynunez
johnnynunez Add startup_max_wait_seconds parameter to Llama-4-Scout-BF16-fi-cutla…
e6266b5c
johnnynunez Merge remote-tracking branch 'origin/main'
0a85d8d1
johnnynunez requested a review from vadiklyutiy 2 days ago
johnnynunez
johnnynunez closed this 1 day ago
johnnynunez reopened this 11 hours ago
johnnynunez changed the title from "[NVIDIA] Update FlashInfer to version 0.6.7.post2. Avoid re-downloading BMM export headers when flashinfer-cubin is installed" to "[NVIDIA] Update FlashInfer to version 0.6.7.post3. Avoid re-downloading BMM export headers when flashinfer-cubin is installed" 11 hours ago
johnnynunez Merge branch 'vllm-project:main' into main
656b6cac
johnnynunez Update FlashInfer to version 0.6.7.post3 in Dockerfiles and related f…
a4c2278c

Assignees
No one assigned