onnxruntime
90c263f4 - Add API to compile a model (#24207)

Commit
254 days ago
Add API to compile a model (#24207)

### Description
- Adds C/C++ API functionality to compile a model (i.e., generate a model with EPContext nodes) using explicit APIs.
- Adds support for compiling when input or output models are in memory (not just files).
- Allows specifying the threshold for when initializers are stored in an external file.
- Allows file paths of arbitrary lengths (session_option key/value configs limited string length to 2048).

List of C API functions:
```C++
ORT_API(const OrtCompileApi*, GetCompileApi);

ORT_API(void, ReleaseModelCompilationOptions, _Frees_ptr_opt_ OrtModelCompilationOptions*);

ORT_API2_STATUS(CreateModelCompilationOptionsFromSessionOptions, _In_ const OrtEnv* env,
                _In_ const OrtSessionOptions* session_options, _Outptr_ OrtModelCompilationOptions** out);

ORT_API2_STATUS(ModelCompilationOptions_SetInputModelPath, _In_ OrtModelCompilationOptions* model_compile_options,
                _In_ const ORTCHAR_T* input_model_path);

ORT_API2_STATUS(ModelCompilationOptions_SetInputModelFromBuffer, _In_ OrtModelCompilationOptions* model_compile_options,
                _In_ const void* input_model_data, size_t input_model_data_size);

ORT_API2_STATUS(ModelCompilationOptions_SetOutputModelPath, _In_ OrtModelCompilationOptions* model_compile_options,
                _In_ const ORTCHAR_T* output_model_path);

ORT_API2_STATUS(ModelCompilationOptions_SetOutputModelExternalInitializersFile,
                _In_ OrtModelCompilationOptions* model_compile_options,
                _In_ const ORTCHAR_T* external_initializers_file_path,
                size_t external_initializer_size_threshold);

ORT_API2_STATUS(ModelCompilationOptions_SetOutputModelBuffer, _In_ OrtModelCompilationOptions* model_compile_options,
                _Inout_ OrtAllocator* allocator, void** output_model_buffer_ptr, size_t* output_model_buffer_size_ptr);

ORT_API2_STATUS(ModelCompilationOptions_SetEpContextEmbedMode, _In_ OrtModelCompilationOptions* model_compile_options,
                bool embed_ep_context_in_model);

ORT_API2_STATUS(CompileModel, _In_ const OrtEnv* env, _In_ const OrtModelCompilationOptions* model_options);
```

Example (see unit tests for others):
```C++
#include "onnxruntime_cxx_api.h"

// Test using the CompileModel() API with settings:
//   - input model from buffer
//   - output model file
//   - EPContext nodes in output model use embedded binary blobs.
TEST_F(QnnHTPBackendTests, CompileApi_FromSessionOptions_InputModelAsBuffer_Embedded) {
  const ORTCHAR_T* output_model_file = ORT_TSTR("./qnn_context_binary_multi_partition_test.onnx");
  std::filesystem::remove(output_model_file);

  // Initialize session options with QNN EP
  Ort::SessionOptions session_options;
  ProviderOptions provider_options;
#if defined(_WIN32)
  provider_options["backend_path"] = "QnnHtp.dll";
#else
  provider_options["backend_path"] = "libQnnHtp.so";
#endif
  provider_options["offload_graph_io_quantization"] = "0";
  session_options.AppendExecutionProvider("QNN", provider_options);

  // Create model compilation options from the session options.
  Ort::ModelCompilationOptions compile_options(*ort_env, session_options);
  compile_options.SetInputModelFromBuffer(reinterpret_cast<const void*>(model_data.data()), model_data.size());
  compile_options.SetOutputModelPath(output_model_file);
  compile_options.SetEpContextEmbedMode(true);

  // Compile the model.
  Ort::Status status = Ort::CompileModel(*ort_env, compile_options);
  ASSERT_TRUE(status.IsOK());

  // Make sure the compiled model was generated and has the expected number of EPContext nodes.
  ASSERT_TRUE(std::filesystem::exists(output_model_file));
  CheckEpContextNodeCounts(output_model_file, 2, 2);
}
```

### Motivation and Context
Improve compilation workflow and add new capabilities.

---------

Co-authored-by: Scott McKay <skottmckay@gmail.com>
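
The example test passes `model_data.data()` and `model_data.size()` to `SetInputModelFromBuffer`, but the excerpt does not show how `model_data` is filled. Below is a minimal sketch of one way to load a serialized model into memory, assuming the buffer is a `std::vector<char>` holding the raw bytes of an `.onnx` file. The helper name `ReadModelBytes` is hypothetical and is not part of ONNX Runtime or of this commit.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not an ONNX Runtime API): reads an entire file into a
// byte buffer that can later be handed to an in-memory model API.
std::vector<char> ReadModelBytes(const std::filesystem::path& model_path) {
  // Open in binary mode so that no newline translation alters the bytes.
  std::ifstream in(model_path, std::ios::binary);
  if (!in) {
    throw std::runtime_error("cannot open model file: " + model_path.string());
  }

  // Size the buffer up front from the file size reported by the filesystem.
  const auto size = std::filesystem::file_size(model_path);
  std::vector<char> bytes(static_cast<std::size_t>(size));

  // Read the whole file in one call; treat a short read as an error.
  if (size > 0 && !in.read(bytes.data(), static_cast<std::streamsize>(bytes.size()))) {
    throw std::runtime_error("short read on model file: " + model_path.string());
  }
  return bytes;
}
```

With such a helper, the test's call would read `auto model_data = ReadModelBytes(input_path);` followed by `compile_options.SetInputModelFromBuffer(model_data.data(), model_data.size());`, where `input_path` is an assumed variable. Since the API receives only a pointer and a size, it is safest to keep the buffer alive until `Ort::CompileModel` has returned.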