openvino
c890f24a - [NPU][NPUW] Fix UAF in pyramid attention shared tensor buffer. (#34871)

Commit
19 days ago
[NPU][NPUW] Fix UAF in pyramid attention shared tensor buffer. (#34871)

### Details:
- In `setup_pyramid_infer_requests`, non-last pyramid infer requests share memory with `m_subrequests[real_idx]` via raw pointers wrapped in `ov::Tensor`. (see [here](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_npu/src/plugin/npuw/just_sync_infer_request.cpp#L1157))
- When `bind_pyramid_attention_inputs` calls `set_tensor` on `m_subrequests` for the last chunk (see the [fast path](https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_npu/src/plugin/npuw/base_sync_infer_request.cpp#L848)), the original tensor SoPtr refcount drops to zero, freeing the buffer while pyramid requests still hold raw pointers into it.
- The next round inference then aborts when accessing the freed memory.
- Fix: add `m_pyramid_anchor_tensors` to `JustInferRequest`, so that the shared tensor and `JustInferRequest` have the same life cycle.

### Tickets:
- *[CVS-183325](https://jira.devtools.intel.com/browse/CVS-183325)*

### AI Assistance:
- *AI assistance used: no / yes*
- *If yes, summarize how AI was used and what human validation was performed (build/tests/manual checks).*

Signed-off-by: intelgaoxiong <xiong.gao@intel.com>
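
The lifetime problem described in the commit message can be illustrated with a minimal, self-contained sketch. It does not use OpenVINO and is not the plugin's code: `Buffer`, `RawView`, `Subrequest`, `Request`, and `buffer_survives_rebind` are hypothetical stand-ins, with `std::shared_ptr` playing the role of the ref-counted tensor handle and `anchors` playing the role of `m_pyramid_anchor_tensors`. A `std::weak_ptr` probe reports whether the original buffer is still alive after the rebind, so the sketch never dereferences a dangling pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a ref-counted, owning tensor handle.
using Buffer = std::shared_ptr<std::vector<float>>;

// Hypothetical stand-in for a tensor built over a raw pointer: it does not own the memory.
struct RawView {
    const float* data = nullptr;
    std::size_t size = 0;
};

// Hypothetical stand-in for a subrequest holding one input tensor.
struct Subrequest {
    Buffer input;
    // Rebinding releases this subrequest's reference to the previous buffer.
    void set_tensor(Buffer t) { input = std::move(t); }
};

// Hypothetical stand-in for the owning infer request.
struct Request {
    Subrequest subrequest;
    std::vector<RawView> pyramid_views;  // non-owning aliases of the shared buffer
    std::vector<Buffer> anchors;         // assumed analogue of m_pyramid_anchor_tensors

    void setup_pyramid(bool keep_anchor) {
        assert(subrequest.input && "subrequest must have an input before sharing it");
        if (keep_anchor) {
            // Hold an extra owning reference for as long as the request lives.
            anchors.push_back(subrequest.input);
        }
        pyramid_views.push_back({subrequest.input->data(), subrequest.input->size()});
    }
};

// Returns true if the original buffer is still alive after the subrequest is rebound.
bool buffer_survives_rebind(bool keep_anchor) {
    Request req;
    req.subrequest.input = std::make_shared<std::vector<float>>(4, 1.0f);
    std::weak_ptr<std::vector<float>> probe = req.subrequest.input;

    req.setup_pyramid(keep_anchor);
    // Mirrors the rebind on the last chunk: the subrequest drops its reference.
    req.subrequest.set_tensor(std::make_shared<std::vector<float>>(4, 2.0f));

    return !probe.expired();
}
```

Without the anchor, the subrequest is the sole owner, so the rebind frees the buffer and the entries in `pyramid_views` dangle. With the anchor, the buffer lives as long as the `Request`, which is the life-cycle coupling the fix describes.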