vllm
Enable prefix caching with full cuda graphs
#19617

Merged

WoosukKwon merged 4 commits into main from full-cuda-graph-prefix-caching

WoosukKwon

[Bugfix] Enable prefix caching with full cuda graphs

e8f07e50

WoosukKwon

minor

b10f335f

github-actions

gemini-code-assist commented on 2025-06-13

WoosukKwon changed the title ~~Full cuda graph prefix caching~~ Enable prefix caching with full cuda graphs 218 days ago

WoosukKwon added ready

gemini-code-assist commented on 2025-06-13

mergify added needs-rebase

houseroad commented on 2025-06-14

WoosukKwon

merge

23b2d387

WoosukKwon requested a review from hmellor 217 days ago

WoosukKwon requested a review from njhill 217 days ago

WoosukKwon requested a review from LiuXiaoxuanPKU 217 days ago

WoosukKwon requested a review from alexm-redhat 217 days ago

WoosukKwon requested a review from comaniac 217 days ago

WoosukKwon requested a review from robertgshaw2-redhat 217 days ago

WoosukKwon requested a review from ywang96 217 days ago

WoosukKwon requested a review from tlrmchlsmth 217 days ago

WoosukKwon requested a review from aarnphm 217 days ago

WoosukKwon

Merge branch 'main' into full-cuda-graph-prefix-caching

e56aad45

mergify added documentation

mergify added ci/build

mergify added frontend

mergify added llama

mergify added rocm

mergify added structured-output

mergify added speculative-decoding

mergify added v1

mergify removed needs-rebase

ywang96 approved these changes on 2025-06-15

WoosukKwon enabled auto-merge (squash) 216 days ago

disabled auto-merge 216 days ago
Manually disabled by user

WoosukKwon merged 055915e6 into main 216 days ago

WoosukKwon deleted the full-cuda-graph-prefix-caching branch 216 days ago

Reviewers

ywang96

houseroad

gemini-code-assist

hmellor

njhill

LiuXiaoxuanPKU

alexm-redhat

comaniac

robertgshaw2-redhat

tlrmchlsmth

aarnphm

Assignees

No one assigned

Labels

documentation rocm structured-output frontend speculative-decoding ready ci/build v1 llama

Milestone

No milestone