onnxruntime
9ca299d0 - Implement CUDA EP Plugin profiling API (#28216)

Commit
16 days ago
This pull request adds support for CUPTI-based GPU profiling to the CUDA plugin execution provider (EP) in ONNX Runtime. Profiling is now available in the plugin EP when built with the `onnxruntime_ENABLE_CUDA_PROFILING` CMake flag, enabling detailed GPU activity tracing and integration with ORT's profiling system. The implementation introduces a new `CudaPluginEpProfiler` that bridges between ORT's profiling API and CUPTI, and updates the build system, plugin interface, and documentation accordingly.

**CUDA Plugin Profiling Integration:**

* Added a new `CudaPluginEpProfiler` class (`cuda_profiler_plugin.h/.cc`) that implements the `OrtEpProfilerImpl` interface, delegates to a `CUPTIManager` singleton for GPU activity tracing, and provides callbacks for profiling lifecycle and event correlation.
* Updated the plugin EP interface in `cuda_ep.h`/`cuda_ep.cc` to conditionally provide a `CreateProfilerImpl` callback when profiling is enabled, wiring up the new profiler implementation.
* Modified the CMake build (`onnxruntime_providers_cuda_plugin.cmake`) to conditionally link against `CUDA::cupti` and define the necessary compile-time flags for profiling support.

**Documentation Updates:**

* Expanded the design documentation (`cuda_plugin_ep_design.md`) to describe the profiling and observability architecture, CUPTI integration, correlation ID flow, event collection, and differences from the in-tree CUDA EP profiler. Build configuration and relevant source files are also documented.

**Miscellaneous:**

* Included the new profiler header in the plugin EP implementation.
* Minor test and import adjustments (e.g., `test_cuda_plugin_ep.py`).

These changes enable the CUDA plugin EP to participate fully in ORT's profiling system, allowing users to observe GPU kernel and memory activity in conjunction with CPU-side events when profiling is enabled.
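
The wiring described above (a profiler object that delegates to a process-wide tracer, exposed through a callback that exists only when profiling is compiled in) can be illustrated with a minimal, self-contained sketch. Every name below (`SKETCH_ENABLE_PROFILING`, `TracerSingleton`, `ProfilerIface`, `PluginProfiler`, `EpCallbacks`, `MakeCallbacks`) is hypothetical and chosen for illustration. None of it is the ONNX Runtime or CUPTI API, and the actual signatures in `cuda_ep.cc` and `cuda_profiler_plugin.cc` are likely to differ.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a build-time flag such as the one a CMake
// option would define. Set to 0 to compile the profiler callback out.
#define SKETCH_ENABLE_PROFILING 1

// One traced GPU activity, tagged with the ID used to match it to a CPU-side event.
struct GpuEvent {
  std::uint64_t correlation_id;
  std::string name;
};

// Hypothetical process-wide tracer; it only models the role a tracing
// singleton plays and does not call any real tracing library.
class TracerSingleton {
 public:
  static TracerSingleton& Instance() {
    static TracerSingleton tracer;
    return tracer;
  }
  void Start() { active_ = true; }
  // Events are recorded only while tracing is active.
  void Record(std::uint64_t id, std::string name) {
    if (active_) events_.push_back({id, std::move(name)});
  }
  // Stops tracing and hands back everything collected so far.
  std::vector<GpuEvent> Stop() {
    active_ = false;
    std::vector<GpuEvent> out;
    out.swap(events_);
    return out;
  }

 private:
  bool active_ = false;
  std::vector<GpuEvent> events_;
};

// Hypothetical profiler interface that a host runtime would call into.
struct ProfilerIface {
  virtual ~ProfilerIface() = default;
  virtual void Start() = 0;
  virtual std::vector<GpuEvent> End() = 0;
};

// Bridges the interface to the singleton tracer by pure delegation.
class PluginProfiler : public ProfilerIface {
 public:
  void Start() override { TracerSingleton::Instance().Start(); }
  std::vector<GpuEvent> End() override { return TracerSingleton::Instance().Stop(); }
};

// Hypothetical callback table; a null entry means "profiling not supported".
struct EpCallbacks {
  std::unique_ptr<ProfilerIface> (*create_profiler)() = nullptr;
};

EpCallbacks MakeCallbacks() {
  EpCallbacks callbacks;
#if SKETCH_ENABLE_PROFILING
  // The factory is wired up only when profiling is compiled in.
  callbacks.create_profiler = []() -> std::unique_ptr<ProfilerIface> {
    return std::make_unique<PluginProfiler>();
  };
#endif
  return callbacks;
}
```

In this sketch a host would check `create_profiler` for null before calling it, so a build without the flag degrades to "no GPU events" rather than failing to link or load.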