[NPUW] Kokoro updates (#35390)

Commit

11 days ago

### Details:
- [Switch to NPUW NONE partitioning for both kokoro parts](https://github.com/openvinotoolkit/openvino/pull/35390/commits/3f8ba6e93b22f6d07ddf7981dcfa3c0e79553877) to improve inference performance (at the cost of longer compilation time)
  - also fix NPUW model caching for model A
- [Fix issue of LSTMSequences using wrong (full 512) input_length value](https://github.com/openvinotoolkit/openvino/pull/35390/commits/a0a90d7e687987698863668f0929228044810dfb)
  - replace the input_lengths value with a Parameter
  - set input_lengths to the real sequence length
  - add the m_a_input_lengths parameter to KokoroInferRequest to manage the Parameter
- [set NPUW_FALLBACK_EXEC=false for Kokoro model to avoid caching problem](https://github.com/openvinotoolkit/openvino/pull/35390/commits/fd9c4240593e253ee49b90ac1e679c06e5419209) discussed in https://github.com/openvinotoolkit/openvino/pull/35635

Metrics:
- NPU4000:
  - Compilation for NPU:
    - w/o PR:
      - cold: 60s
    - PR:
      - cold: 52s
      - warm: 0.6s
  - Performance:
    - baseline:
      - CPU PyTorch - Real-time Speedup Factor: 4.8x
      - CPU OpenVINO - Real-time Speedup Factor: 3.7x
    - w/o PR:
      - NPU - Real-time Speedup Factor: 8x
    - PR:
      - NPU - Real-time Speedup Factor: 8.1x

(Real-time Speedup Factor = Generated Audio Length / Inference Time)

### Tickets:
E210201 E211208 E213246

### AI Assistance:
- *AI assistance used: yes*
  - replace_input_lengths_with_parameter()
  - comments

---------

Co-authored-by: Sergey Shumihin <sergey.shumihin@intel.com>
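For readers reproducing the numbers, the Real-time Speedup Factor defined in the metrics (Generated Audio Length / Inference Time) could be computed roughly as in the sketch below. The function names, the 24 kHz sample rate, and the sample values are assumptions chosen for illustration; they are not taken from the PR or its benchmark scripts.

```python
def audio_length_seconds(num_samples: int, sample_rate_hz: int) -> float:
    """Duration of a generated waveform, given its sample count and sample rate."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    return num_samples / sample_rate_hz


def real_time_speedup_factor(audio_seconds: float, inference_seconds: float) -> float:
    """Generated audio length divided by inference time, as defined in the metrics."""
    if inference_seconds <= 0:
        raise ValueError("inference_seconds must be positive")
    return audio_seconds / inference_seconds


# Hypothetical measurement: 388,800 samples at an assumed 24 kHz output rate
# (16.2 s of audio), generated in 2.0 s of inference time.
audio_s = audio_length_seconds(num_samples=388_800, sample_rate_hz=24_000)
rtsf = real_time_speedup_factor(audio_s, inference_seconds=2.0)
print(f"Real-time Speedup Factor: {rtsf:.1f}x")
```

A factor above 1 means audio is generated faster than it plays back. The inference time should cover only the inference calls being compared, not model compilation, which the metrics report separately.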

References

#35390 - [NPUW] Kokoro updates

Author

OrestChura

Parents

9f7b20e4

openvino 50439fc0 - [NPUW] Kokoro updates (#35390)