openvino
[NPUW] Share kvcache between prefill and generate when chunking is enabled
#32642
Merged
dmatveev merged 34 commits into openvinotoolkit:master from smirnov-alexey:as/npuw_share_kvcache
Introduce lazy memory allocation for ireq's I/O
fcceeb72
Fix no tensor being present in the storage
f01b8d58
Merge branch 'master' of https://github.com/openvinotoolkit/openvino …
f6fbf64d
Address review comments
bd27bbc7
Merge branch 'master' of https://github.com/openvinotoolkit/openvino …
b258ed06
Merge branch 'master' of https://github.com/openvinotoolkit/openvino …
bd03f343
Copy Xiong's changes
7dab40d1
smirnov-alexey requested a review from dmatveev 202 days ago
smirnov-alexey requested a review from intelgaoxiong 202 days ago
smirnov-alexey assigned dmatveev 202 days ago
smirnov-alexey requested a review 202 days ago
smirnov-alexey requested a review 202 days ago
smirnov-alexey added do_not_review
github-actions added category: NPU
github-actions added category: NPUW
smirnov-alexey commented on 2025-10-31
dmatveev added this to the 2026.0 milestone 202 days ago
Remove copy
20af381a
intelgaoxiong commented on 2025-11-04
WIP
a9e2029c
Fix concurrency issue with iterator invalidation
a54d529c
Merge branch 'master' of https://github.com/openvinotoolkit/openvino …
ec23004b
Refactoring
a09921fa
Fix merge
8146143e
Protect get_tensor by mutex
42b7c1d4
Merge branch 'as/npuw_lazy_io_alloc' of https://github.com/smirnov-al…
cf01cf83
Disable kv cache sharing when one of the models is transposed
430af31b
intelgaoxiong commented on 2025-11-06
Fix strides
4ab490b8
esmirno approved these changes on 2025-11-07
Handle strided tensors - copy on host
05895d99
Merge branch 'master' of https://github.com/openvinotoolkit/openvino …
5ececfc6
Handle strided tensors in pyramid attention
2183e6a9
intelgaoxiong approved these changes on 2025-11-10
smirnov-alexey commented on 2025-11-11
smirnov-alexey commented on 2025-11-11
smirnov-alexey commented on 2025-11-11
smirnov-alexey commented on 2025-11-11
smirnov-alexey commented on 2025-11-11
Address review comments
ea2b7a06
Merge branch 'as/npuw_lazy_io_alloc' of https://github.com/smirnov-al…
c2d6cb7a
Address review comments
afa41468
Merge branch 'master' into as/npuw_lazy_io_alloc
3c1120d1
Merge branch 'master' into as/npuw_share_kvcache
a76cea0e
Fix shape
db789400
dmatveev removed do_not_review
dmatveev added do not merge
Merge branch 'master' of https://github.com/openvinotoolkit/openvino …
0b84f331
Remove is_io()
cace85bf
Merge branch 'as/npuw_lazy_io_alloc' of https://github.com/smirnov-al…
25262866
Move kv cache copy to second token time
9088de64
Fix empty tensors for pyramid attention
ef2d3659
Merge branch 'as/npuw_share_kvcache' of https://github.com/smirnov-al…
494672af
Merge branch 'master' into as/npuw_share_kvcache
a69af6b1
smirnov-alexey commented on 2025-11-27
Align code with latest pyramid changes merged
f7d75f4f
dmatveev removed do not merge
dmatveev approved these changes on 2025-11-27
dmatveev merged 3797c559 into master 175 days ago
dmatveev deleted the as/npuw_share_kvcache branch 175 days ago
Reviewers
dmatveev
intelgaoxiong
esmirno
Assignees
dmatveev
Labels
category: NPU
category: NPUW
Milestone
2026.0