vllm
[GDN] Eliminate GPU->CPU sync in prepare_chunk_indices during prefill
#38361
Merged

arpera
arpera requested a review from LucasWilkinson 55 days ago
arpera requested a review from MatthewBonanni 55 days ago
claude commented on 2026-03-27
mergify added v1
gemini-code-assist commented on 2026-03-27
mergify
mergify
arpera force pushed 55 days ago
ZJY0516 commented on 2026-03-27
mergify
mergify added needs-rebase
arpera force pushed 54 days ago
mergify removed needs-rebase
arpera
vadiklyutiy
gemini-code-assist commented on 2026-03-28
vadiklyutiy commented on 2026-03-28
vadiklyutiy
claude commented on 2026-03-28
vadiklyutiy
arpera
arpera requested a review from tdoublep 52 days ago
vadiklyutiy
mergify
arpera
vadiklyutiy
claude commented on 2026-03-30
arpera
ZJY0516
ZJY0516
claude commented on 2026-03-30
arpera
vadiklyutiy commented on 2026-03-31
vadiklyutiy
vadiklyutiy added ready
vadiklyutiy approved these changes on 2026-03-31
arpera [GDN] Eliminate GPU->CPU sync in prepare_chunk_indices during prefill
6386a1d5
arpera Fix gemini-code issues: extract _insert helper in tensor_cache, add T…
8ef70dc1
arpera Fix mypy: add type: ignore for dynamic register attribute
b21bb569
arpera Extract hardcoded chunk_size=64 into FLA_CHUNK_SIZE constant
4e683f22
arpera Fix: skip chunk_indices pre-registration on pure decode steps
a9a72482
arpera [GDN] Pre-compute chunk_indices/chunk_offsets in metadata builder
83ceaad6
arpera Remove dead register() code and duplicate prefill block
22c9779e
arpera Fix Claude review comments: backend guard, kda chunk_size, BT simplif…
62dc2f51
arpera Remove use_flashinfer backend guard for chunk_indices pre-computation
27608149
arpera force pushed to 27608149 51 days ago
mgoin added nvidia
mergify
mergify added needs-rebase
vadiklyutiy Merge branch 'main' into artem/remove-extra-d2h-copy
dae956b2
mergify removed needs-rebase
vadiklyutiy
arpera
mergify
mergify added needs-rebase
arpera Merge branch 'main' into artem/remove-extra-d2h-copy
c47aa23c
arpera fix CI: lazy-import FLA ops to avoid CUDA init in forked subprocess
e1ab7a7b
arpera force pushed to e1ab7a7b 48 days ago
mergify removed needs-rebase
vadiklyutiy enabled auto-merge (squash) 48 days ago
arpera Merge branch 'main' into artem/remove-extra-d2h-copy
1414fc8c
arpera Merge branch 'main' into artem/remove-extra-d2h-copy
1a86ef61
vadiklyutiy merged cb10b7e8 into main 48 days ago

Assignees
No one assigned