onnxruntime
318a2516 - Fix run-level profiling for subgraph operators (#27870)

Commit
16 days ago
Fix run-level profiling for subgraph operators (#27870)

### Description

Run-level profiling (introduced in PR #26846) does not currently capture profiling events for operators inside subgraphs. This PR fixes that by threading the `run_profiler` pointer through `OpKernelContextInternal` to subgraph execution, following the same pattern as `terminate_flag`.

### Root Cause

`utils::ExecuteSubgraph()` had no `run_profiler` parameter and always passed `nullptr` to `ExecuteGraphImpl`, so nested operators (inside If, Loop, Scan, BeamSearch, GreedySearch) were never profiled at the run level.

### Fix

1. **`OpKernelContextInternal`**: Added `run_profiler_` member and `GetRunProfiler()` accessor.
2. **`SessionScope` / `ExecuteKernel()`**: Pass the run profiler into `OpKernelContextInternal`.
3. **`ExecuteSubgraph()`**: Added `profiling::Profiler* run_profiler = nullptr` parameter, forwarded to `ExecuteGraphImpl()`.
4. **Control flow ops** (`if.cc`, `loop.cc`, `scan_utils.cc`): Pass `context_.GetRunProfiler()` to `ExecuteSubgraph()`.
5. **Contrib transformer ops** (`beam_search_impl_gpt.h`, `beam_search_impl_t5.h`, `beam_search_impl_whisper.h`, `greedy_search_impl_gpt.h`): All 8 `ExecuteSubgraph()` call sites updated to pass `this->context_.GetRunProfiler()`.

Plugin EP control flow kernels (`PluginEpIfKernelImpl`, etc.) delegate to the same internal kernels, so the fix propagates automatically.

### Tests

- **`CheckRunProfilerWithSubgraph`** (`inference_session_test.cc`): Runs `if_mul.onnx`, enables run profiling, asserts `mul_0` (inside If's then-branch) appears in the profile JSON.
- **`CheckRunProfilerWithBeamSearch`** (`beam_search_test.cc`): Runs `tiny_gpt2_beamsearch.onnx`, enables run profiling, asserts decoder subgraph Node entries (beyond the top-level BeamSearch op) appear in the profile JSON.

### Files Changed (12 files)

| File | Change |
|------|--------|
| `core/framework/op_kernel_context_internal.h` | Added `run_profiler_` member, `GetRunProfiler()`, constructor param |
| `core/framework/sequential_executor.cc` | `SessionScope::GetRunProfiler()`, pass to `OpKernelContextInternal` |
| `core/framework/utils.h` / `utils.cc` | `run_profiler` param on `ExecuteSubgraph()` |
| `core/providers/cpu/controlflow/if.cc` | Forward `GetRunProfiler()` |
| `core/providers/cpu/controlflow/loop.cc` | Forward `GetRunProfiler()` |
| `core/providers/cpu/controlflow/scan_utils.cc` | Forward `GetRunProfiler()` |
| `contrib_ops/cpu/transformers/beam_search_impl_gpt.h` | 2 call sites |
| `contrib_ops/cpu/transformers/beam_search_impl_t5.h` | 2 call sites |
| `contrib_ops/cpu/transformers/beam_search_impl_whisper.h` | 2 call sites |
| `contrib_ops/cpu/transformers/greedy_search_impl_gpt.h` | 2 call sites |
| `test/framework/inference_session_test.cc` | `CheckRunProfilerWithSubgraph` test |
| `test/contrib_ops/beam_search_test.cc` | `CheckRunProfilerWithBeamSearch` test |
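
### Illustration

The pointer-threading pattern described under Root Cause and Fix can be sketched with a toy model. This is not the ONNX Runtime code: `Profiler`, `KernelContext`, `Node`, `ExecuteGraph`, `ProfileIfMul`, the `forward_profiler` switch, and the node name `if_0` are all hypothetical stand-ins, and only the shape of the change (a context accessor plus a defaulted `run_profiler` parameter on `ExecuteSubgraph()`) mirrors the description above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for profiling::Profiler: records node names only.
struct Profiler {
  std::vector<std::string> events;
  void Record(const std::string& name) { events.push_back(name); }
};

// Hypothetical stand-in for OpKernelContextInternal: carries the run-level
// profiler pointer and exposes it through an accessor.
class KernelContext {
 public:
  explicit KernelContext(Profiler* run_profiler) : run_profiler_(run_profiler) {}
  Profiler* GetRunProfiler() const { return run_profiler_; }

 private:
  Profiler* run_profiler_;
};

// Toy graph node; a non-empty subgraph models a control-flow op such as If.
struct Node {
  std::string name;
  std::vector<Node> subgraph;
};

void ExecuteGraph(const std::vector<Node>& nodes, Profiler* run_profiler,
                  bool forward_profiler);

// The defaulted parameter models the new signature: a caller that omits it
// executes the subgraph with no run-level profiler.
void ExecuteSubgraph(const std::vector<Node>& nodes, bool forward_profiler,
                     Profiler* run_profiler = nullptr) {
  ExecuteGraph(nodes, run_profiler, forward_profiler);
}

void ExecuteGraph(const std::vector<Node>& nodes, Profiler* run_profiler,
                  bool forward_profiler) {
  for (const Node& node : nodes) {
    KernelContext context(run_profiler);
    if (run_profiler != nullptr) run_profiler->Record(node.name);
    if (node.subgraph.empty()) continue;
    if (forward_profiler) {
      // Behavior after the fix: the kernel forwards its context's profiler.
      ExecuteSubgraph(node.subgraph, forward_profiler, context.GetRunProfiler());
    } else {
      // Behavior before the fix: the subgraph always runs with nullptr.
      ExecuteSubgraph(node.subgraph, forward_profiler);
    }
  }
}

// Runs a one-node graph whose If-like node contains mul_0 and returns the
// names that were profiled.
std::vector<std::string> ProfileIfMul(bool forward_profiler) {
  Profiler profiler;
  std::vector<Node> graph = {Node{"if_0", {Node{"mul_0", {}}}}};
  ExecuteGraph(graph, &profiler, forward_profiler);
  return profiler.events;
}
```

In this model, `ProfileIfMul(false)` records only the outer node, while `ProfileIfMul(true)` also records `mul_0`, which is the kind of difference `CheckRunProfilerWithSubgraph` asserts on in the real profile JSON.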