openvino
[NPUW] Optimize token rate for dynamic LoRA.
#31742
Merged

Commits
  • Reuse infer request buffer for LoRA.
    intelgaoxiong committed 303 days ago
  • Solved review comments in PR #31433.
    intelgaoxiong committed 303 days ago
  • Fixed for CI.
    intelgaoxiong committed 303 days ago
  • Set remote tensor for kvcache and prefill
    intelgaoxiong committed 303 days ago
  • Choose device for pre-allocation.
    intelgaoxiong committed 303 days ago
  • Move lora name matching functions to utils.
    intelgaoxiong committed 303 days ago
  • Move allocMem to util.
    intelgaoxiong committed 303 days ago
  • Minor change.
    intelgaoxiong committed 303 days ago
  • No need to bump the serialization version.
    intelgaoxiong committed 303 days ago