openvino
[NPUW]Optimize token rate for dynamic LoRA.
#31742
Merged
Commits (9)
Reuse infer request buffer for LoRA.
intelgaoxiong
committed
303 days ago
Solved review comments in pr#31433.
intelgaoxiong
committed
303 days ago
Fixed for CI.
intelgaoxiong
committed
303 days ago
Set remote tensor for kvcache and prefill
intelgaoxiong
committed
303 days ago
Choose device for pre-allocation.
intelgaoxiong
committed
303 days ago
Move lora name matching functions to utils.
intelgaoxiong
committed
303 days ago
Move allocMem to util.
intelgaoxiong
committed
303 days ago
Minor change.
intelgaoxiong
committed
303 days ago
No need to bump the serialization version.
intelgaoxiong
committed
303 days ago