onnxruntime
a6592fc0 - Cleanup: Consolidate `OpKernel::UseSharedPrePackedBuffers_V2` and `OpKernel::UseSharedPrePackedBuffers` (#27924)

Commit

3 days ago

Cleanup: Consolidate `OpKernel::UseSharedPrePackedBuffers_V2` and `OpKernel::UseSharedPrePackedBuffers` (#27924)

### Description

Consolidate `OpKernel::UseSharedPrePackedBuffers` and `OpKernel::UseSharedPrePackedBuffers_V2` into a single virtual method, resolving the TODO in `op_kernel.h`.

#### Background

The `OpKernel` class previously had two virtual methods for consuming shared pre-packed weight buffers:

- **`UseSharedPrePackedBuffers`** (V1): 3 params: `prepacked_buffers`, `input_idx`, `used_shared_buffers`
- **`UseSharedPrePackedBuffers_V2`**: 4 params: adds `prepacked_buffer_sizes` (a `gsl::span<const size_t>`)

V2 was introduced to pass buffer sizes alongside the buffers. Its default implementation forwarded to V1 for backward compatibility. The framework (`session_state.cc`) only ever called V2.

#### Changes

Merged both methods into a single `UseSharedPrePackedBuffers` using the V2 signature:

```cpp
virtual Status UseSharedPrePackedBuffers(std::vector<BufferUniquePtr>& prepacked_buffers,
                                         gsl::span<const size_t> prepacked_buffer_sizes,
                                         int input_idx,
                                         /*out*/ bool& used_shared_buffers);
```

Updated **27 files** across the codebase:

| Category | Files | Change |
|----------|-------|--------|
| Base class | `op_kernel.h` | Removed V1 + V2; single 4-param method |
| Framework | `session_state.cc` | Renamed `_V2` call |
| Plugin EP bridge | `ep_kernel_registration.cc` | Renamed override |
| QMoECPU | `moe_quantization_cpu.h/.cc` | Renamed V2 override + template instantiations |
| CPU provider (8 kernels) | `gemm`, `matmul`, `conv_transpose`, `fp16_conv`, `qlinearconv`, `matmul_integer_base`, `deep_cpu_lstm`, `deep_cpu_gru` | Added `prepacked_buffer_sizes` param |
| ACL provider (2 kernels) | `acl/conv`, `acl/matmul` | Added param |
| Contrib ops (4 kernels) | `matmul_nbits`, `dynamic_quantize_lstm`, `attention_quant`, `bert/attention` | Added param |
| Tests | `session_state_test.cc` | Updated test kernel override |

#### Notes

- Existing V1 overrides add the new `prepacked_buffer_sizes` parameter as **unnamed/unused** (`/*prepacked_buffer_sizes*/`); there are no logic changes in those kernels.
- The C API (`SetSharedPrePackedWeight` in `onnxruntime_ep_c_api.h`) already passes buffer sizes, so **no C API changes** were needed.
- Private helper functions (e.g., `UseSharedPrePackedBuffersImpl` in LSTM/GRU) are not virtual overrides and were **not modified**.

### Motivation and Context

Addresses the TODO at `include/onnxruntime/core/framework/op_kernel.h:139`:

> TODO: Consolidate UseSharedPrePackedBuffers and UseSharedPrePackedBuffers_V2 into a single function, which will require updating kernel-based provider-bridge EPs (cpu, cuda, webgpu).
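The override pattern described in the notes above can be sketched as follows. This is a minimal, standalone illustration and not ONNX Runtime code. `Status`, `BufferUniquePtr`, and `SizeSpan` are simplified stand-ins for the real types (`SizeSpan` takes the place of `gsl::span<const size_t>`). `LegacyStyleKernel`, `SizeAwareKernel`, and `OfferSharedBuffer` are hypothetical names. The default body that reports `used_shared_buffers = false` is an assumption about the base-class behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for ONNX Runtime types so the sketch compiles alone.
struct Status {
  bool ok = true;
  static Status OK() { return Status{}; }
};
using BufferUniquePtr = std::unique_ptr<char[]>;  // the real type uses a custom deleter
struct SizeSpan {  // stand-in for gsl::span<const size_t>
  const std::size_t* data;
  std::size_t count;
  std::size_t size() const { return count; }
  std::size_t operator[](std::size_t i) const { return data[i]; }
};

// Base class exposing only the consolidated 4-parameter method.
class OpKernel {
 public:
  virtual ~OpKernel() = default;
  virtual Status UseSharedPrePackedBuffers(std::vector<BufferUniquePtr>& /*prepacked_buffers*/,
                                           SizeSpan /*prepacked_buffer_sizes*/,
                                           int /*input_idx*/,
                                           /*out*/ bool& used_shared_buffers) {
    used_shared_buffers = false;  // assumed default: nothing consumed
    return Status::OK();
  }
};

// Former V1-style override: the sizes parameter is added but left unnamed and unused.
class LegacyStyleKernel : public OpKernel {
 public:
  Status UseSharedPrePackedBuffers(std::vector<BufferUniquePtr>& prepacked_buffers,
                                   SizeSpan /*prepacked_buffer_sizes*/,
                                   int input_idx,
                                   /*out*/ bool& used_shared_buffers) override {
    used_shared_buffers = false;
    if (input_idx == 1 && !prepacked_buffers.empty()) {
      packed_b_ = std::move(prepacked_buffers[0]);  // take ownership of the shared buffer
      used_shared_buffers = true;
    }
    return Status::OK();
  }
  BufferUniquePtr packed_b_;
};

// Former V2-style override: also records the size of the buffer it takes.
class SizeAwareKernel : public OpKernel {
 public:
  Status UseSharedPrePackedBuffers(std::vector<BufferUniquePtr>& prepacked_buffers,
                                   SizeSpan prepacked_buffer_sizes,
                                   int input_idx,
                                   /*out*/ bool& used_shared_buffers) override {
    used_shared_buffers = false;
    if (input_idx == 1 && !prepacked_buffers.empty() && prepacked_buffer_sizes.size() > 0) {
      packed_b_ = std::move(prepacked_buffers[0]);
      packed_b_size_ = prepacked_buffer_sizes[0];
      used_shared_buffers = true;
    }
    return Status::OK();
  }
  BufferUniquePtr packed_b_;
  std::size_t packed_b_size_ = 0;
};

// Plays the role of the framework caller: one virtual call serves every kernel.
bool OfferSharedBuffer(OpKernel& kernel, std::size_t num_bytes) {
  std::vector<BufferUniquePtr> buffers;
  buffers.emplace_back(new char[num_bytes]());
  const std::size_t sizes[] = {num_bytes};
  bool used = false;
  Status s = kernel.UseSharedPrePackedBuffers(buffers, SizeSpan{sizes, 1}, /*input_idx*/ 1, used);
  return s.ok && used;
}
```

Calling `OfferSharedBuffer` on each kernel type (for example from `main`) exercises both override styles through the same entry point. With a single virtual method, the caller no longer needs to distinguish kernels that care about buffer sizes from those that ignore them.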

References

#27924 - Cleanup: Consolidate `OpKernel::UseSharedPrePackedBuffers_V2` and `OpKernel::UseSharedPrePackedBuffers`

Author

adrianlizarraga

Parents

5b2c0da3
