openvino
[NPUW] Support prefill-chunk for text-embedding model
#33076
Merged

mengweiguo
github-actions github-actions added category: NPU
github-actions github-actions added category: NPUW
sys-openvino-ci sys-openvino-ci added ExternalIntelPR
mengweiguo mengweiguo force pushed 199 days ago
mengweiguo mengweiguo changed the title from "Support prefill-chunk for text-embedding model" to "NPUW] Support prefill-chunk for text-embedding model" 198 days ago
mengweiguo mengweiguo changed the title from "NPUW] Support prefill-chunk for text-embedding model" to "[NPUW] Support prefill-chunk for text-embedding model" 198 days ago
mengweiguo mengweiguo marked this pull request as ready for review 198 days ago
mengweiguo mengweiguo requested a review 198 days ago
mengweiguo mengweiguo requested a review 198 days ago
mengweiguo mengweiguo force pushed 196 days ago
mengweiguo mengweiguo force pushed 194 days ago
mengweiguo mengweiguo force pushed 186 days ago
mengweiguo
dmatveev
dmatveev commented on 2025-12-16
AlexanderKalistratov
AlexanderKalistratov commented on 2025-12-05
mengweiguo
mengweiguo mengweiguo force pushed 180 days ago
mengweiguo mengweiguo force pushed 179 days ago
mengweiguo
mengweiguo mengweiguo requested a review from dmatveev dmatveev 179 days ago
mengweiguo mengweiguo requested a review from AlexanderKalistratov AlexanderKalistratov 179 days ago
AlexanderKalistratov
AlexanderKalistratov commented on 2025-12-23
mengweiguo
mengweiguo mengweiguo force pushed 178 days ago
AlexanderKalistratov
AlexanderKalistratov commented on 2025-12-24
AlexanderKalistratov
AlexanderKalistratov commented on 2025-12-24
mengweiguo Support prefill-chunk for text-embedding model
dc8bb51a
mengweiguo code cleanup
e3908be9
mengweiguo Add option `normalize` support
25761266
mengweiguo Adjust padding side
23a3f644
mengweiguo Move data to left side
cf78e19d
mengweiguo Fix CPU fallback issue.
a883d8a0
mengweiguo Remove post model and cache chunk output
92d3403e
mengweiguo Put Lora out
6ac98b50
mengweiguo Fix conflict
8d11919a
mengweiguo Refactor compiled-model and infer-request
29b8ec77
mengweiguo Add `LLMInferBaseRequest` as base request for LLM and Embedding
fd1f7895
mengweiguo Update serialization version `0.16->0.17`
d1cc2c8a
mengweiguo Rebuild model pass and add model check
f9eab5de
mengweiguo Remove `input_token_ids`
c975df36
mengweiguo Code cleanup
a459661b
dmatveev dmatveev added this to the 2026.0 milestone 168 days ago
dmatveev
dmatveev commented on 2026-01-02
mengweiguo Move `pad_position_ids` to `infer_request_utils.hpp`
9764a0b3
mengweiguo mengweiguo force pushed to 9764a0b3 166 days ago
mengweiguo mengweiguo requested a review from dmatveev dmatveev 162 days ago
mengweiguo mengweiguo requested a review from AlexanderKalistratov AlexanderKalistratov 162 days ago
AlexanderKalistratov
AlexanderKalistratov approved these changes on 2026-01-08
AlexanderKalistratov AlexanderKalistratov enabled auto-merge 162 days ago
dmatveev
mengweiguo
AlexanderKalistratov
mengweiguo
AlexanderKalistratov AlexanderKalistratov merged e5c24ae9 into master 156 days ago

