onnxruntime
de17d53c - Custom Op runtime wrapper (#13427)

Commit
3 years ago
Custom Op runtime wrapper (#13427) ### Description Adds the below C APIs to support custom ops that wrap an entire model to be inferenced with an external runtime. The current SNPE EP is an example of an EP that could be ported to use a custom op wrapper. Ex: The custom op stores the serialized SNPE DLC binary as a string attribute. The SNPE model is built when the kernel is created. The model is inferenced with SNPE APIs on call to the kernel's compute method. #### C APIs | API | Description | Why | | --- | --- | --- | | `KernelInfo_GetInputCount` | Gets number of inputs from `OrtKernelInfo`. | Query I/O characteristics during kernel creation<sup>1</sup> | | `KernelInfo_GetOutputCount` | Gets number of outputs from `OrtKernelInfo`. | Query I/O characteristics during kernel creation<sup>1</sup> | | `KernelInfo_GetInputName` | Gets an input's name. | Query I/O characteristics during kernel creation<sup>1</sup> | | `KernelInfo_GetOutputName` | Gets an output's name. | Query I/O characteristics during kernel creation<sup>1</sup> | | `KernelInfo_GetInputTypeInfo` | Gets the type/shape information for an input. | Query I/O characteristics during kernel creation<sup>1</sup> | | `KernelInfo_GetOutputTypeInfo` | Gets the type/shape information for an output. | Query I/O characteristics during kernel creation<sup>1</sup> | | `KernelInfoGetAttribute_tensor` | Get a OrtValue tensor stored as an attribute in the graph node | Extract serialized models, weights, etc. | | `GetSessionConfigEntry` | Get a session configuration value | Need to be able to get session-time configurations from within custom op | | `HasSessionConfigEntry` | Check if session configuration entry exists. | Need to be able to get session-time configurations from within custom op | #### Why so many KernelInfo APIs?<sup>1</sup> Similar APIs currently exist for `OrtKernelContext`, but not `OrtKernelInfo`. Note that `OrtKernelContext` is passed to the custom op on call to its kernel's compute() function. However, `OrtKernelInfo` is available on kernel creation, which occurs when the session is created. Having these APIs available from `OrtKernelInfo` allows an operator to trade-off computation time for session-creation time, and vice versa. Operators that must build expensive state may prefer to do it during session creation time instead of compute-time. SNPE is an example of an EP that needs to be able to query `KernelInfo` for the name, type, and shape of inputs and outputs in order to build the model from the serialized DLC data. This is an expensive operation. Other providers (e.g., OpenVINO) are able to query i/o info from the serialized model, so they do not strictly need these APIs. However, the APIs can still be used to validate the expected I/O characteristics. Additionally, several of our CPU contrib ops currently use the same internal version of these KernelInfo APIs (Ex: [qlinear_softmax](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/contrib_ops/cpu/quantization/qlinear_softmax.cc#L71)). If custom ops are also meant to be a test bed for future ops, then all custom ops (not just runtime wrappers) would benefit from the addition of these public KernelInfo APIs (IMO). #### Example of usage in a custom OP From `onnxruntime/test/testdata/custom_op_openvino_wrapper_library/openvino_wrapper.h` ```c++ struct CustomOpOpenVINO : Ort::CustomOpBase<CustomOpOpenVINO, KernelOpenVINO> { explicit CustomOpOpenVINO(Ort::ConstSessionOptions session_options); CustomOpOpenVINO(const CustomOpOpenVINO&) = delete; CustomOpOpenVINO& operator=(const CustomOpOpenVINO&) = delete; void* CreateKernel(const OrtApi& api, const OrtKernelInfo* info) const; constexpr const char* GetName() const noexcept { return "OpenVINO_Wrapper"; } constexpr const char* GetExecutionProviderType() const noexcept { return "CPUExecutionProvider"; } // IMPORTANT: In order to wrap a generic runtime-specific model, the custom operator // must have a non-homogeneous variadic input and output. constexpr size_t GetInputTypeCount() const noexcept { return 1; } constexpr size_t GetOutputTypeCount() const noexcept { return 1; } constexpr ONNXTensorElementDataType GetInputType(size_t /* index */) const noexcept { return ONNX_TENSOR_ELEMENT_DATA_TYPE_UNDEFINED; } constexpr ONNXTensorElementDataType GetOutputType(size_t /* index */) const noexcept { return ONNX_TENSOR_ELEMENT_DATA_TYPE_UNDEFINED; } constexpr OrtCustomOpInputOutputCharacteristic GetInputCharacteristic(size_t /* index */) const noexcept { return INPUT_OUTPUT_VARIADIC; } constexpr OrtCustomOpInputOutputCharacteristic GetOutputCharacteristic(size_t /* index */) const noexcept { return INPUT_OUTPUT_VARIADIC; } constexpr bool GetVariadicInputHomogeneity() const noexcept { return false; // heterogenous } constexpr bool GetVariadicOutputHomogeneity() const noexcept { return false; // heterogeneous } std::vector<std::string> GetSessionConfigKeys() const { return {"device_type"}; } private: std::unordered_map<std::string, std::string> session_configs_; }; ``` #### How to create a session: ```c++ Ort::Env env; Ort::SessionOptions session_opts; Ort::CustomOpConfigs custom_op_configs; // Create local session config entries for the custom op. custom_op_configs.AddConfig("OpenVINO_Wrapper", "device_type", "CPU"); // Register custom op library and pass in the custom op configs (optional). session_opts.RegisterCustomOpsLibrary(lib_name, custom_op_configs); Ort::Session session(env, model_path.data(), session_opts); ``` ### Motivation and Context Allows creation of simple "wrapper" EPs outside of the main ORT code base.
Parents
Loading