onnxruntime
138a3a35 - Set shared memory type based on options during the compilation phase (#24196)

Commit
329 days ago
### Description

During inference, setting the QNN EP option enable_htp_shared_memory_allocator signals that RPC-allocated buffers are used, which avoids buffer copies between the CPU and the NPU. This PR adds the same hint to the compilation phase, so that when RPC memory is going to be used, any additional allocations on the CPU can be avoided.

### Motivation and Context

This should help reduce peak CPU memory consumption when running AI workloads that use shared memory.

Related PR: https://github.com/microsoft/onnxruntime/pull/23136

Co-authored-by: Ashish Garg (AISW) <ashigarg@qti.qualcomm.com>
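For context, the option named in the description is passed to the QNN EP as a provider option when a session is created. The following is a minimal sketch from the user side, not code from this commit. The option key comes from the commit message; the string values "1"/"0", the `backend_path` value `QnnHtp.dll`, and the helper names `qnn_provider_options` and `create_session` are assumptions made for illustration.

```python
from typing import Dict


def qnn_provider_options(use_shared_memory: bool,
                         backend_path: str = "QnnHtp.dll") -> Dict[str, str]:
    """Build a provider-options dict for the QNN EP.

    Hypothetical helper. The backend_path default and the "1"/"0"
    encoding are assumptions; adjust them for the target platform.
    """
    return {
        "backend_path": backend_path,
        # Option key taken from the commit message.
        "enable_htp_shared_memory_allocator": "1" if use_shared_memory else "0",
    }


def create_session(model_path: str, use_shared_memory: bool = True):
    """Create an inference session that requests the QNN EP.

    onnxruntime is imported lazily so the helper above stays usable
    on machines without a QNN-enabled build installed.
    """
    import onnxruntime as ort

    return ort.InferenceSession(
        model_path,
        providers=["QNNExecutionProvider"],
        provider_options=[qnn_provider_options(use_shared_memory)],
    )
```

Calling `create_session("model.onnx")` would require an onnxruntime build with the QNN EP and a device with HTP support; the model path here is a placeholder.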