vllm
[PD][HeteroArch]Fix accuracy issue with CPU_ATTN as Decoder and Flash_ATTN as prefiller
#38935
Merged
bigPYJ1151 merged 6 commits into vllm-project:main from xuechendi:heter_pd_with_cpu_attn_as_decoder
add post_process path for CPU (0ca6a24d)
Enable pack_kv_cache for CPU (736a3d89)
xuechendi requested a review from bigPYJ1151 8 days ago
xuechendi requested a review from NickLucche 8 days ago
xuechendi requested a review from ApostaC 8 days ago
xuechendi requested a review from orozery 8 days ago
mergify added intel-gpu
mergify added cpu
mergify added kv-connector
gemini-code-assist commented on 2026-04-03
Add a skip for HMA (236f573d)
Move n,h,d fetch to platform pack_kv_cache (ce3ca08f)
bigPYJ1151 approved these changes on 2026-04-07
bigPYJ1151 added ready
Merge remote-tracking branch 'origin/main' into heter_pd_with_cpu_att… (45367352)
Fix UT (260a3acd)
mergify added v1
bigPYJ1151 merged ef5a2268 into main 2 days ago
NickLucche commented on 2026-04-09
Reviewers
bigPYJ1151
NickLucche
gemini-code-assist
ApostaC
orozery
Assignees
No one assigned
Labels
intel-gpu
ready
v1
cpu
kv-connector
Milestone
No milestone