openvino
eac9970a - [NPUW]Optimize token rate for dynamic LoRA. (#31742)

Commit
234 days ago
### Details:
1. Optimize dynamic LoRA token rate by L0 remote tensor pre-allocation.
2. Solve review comments in https://github.com/openvinotoolkit/openvino/pull/31433
3. Move some comment functions to utils.

### Tickets:
- *[EISW-179843](https://jira.devtools.intel.com/browse/EISW-179843)*

---------

Signed-off-by: intelgaoxiong <xiong.gao@intel.com>