onnxruntime
66c9f1c0 - Make sure TRT EPs can loads models when initializers in memory (#26721)

Commit

94 days ago

Make sure TRT EPs can loads models when initializers in memory (#26721)

This PR moves the conversion of initializers to in-memory OrtValues from the `Graph` constructor to an early graph-transformation step that runs before partitioning. This avoids running the conversion whenever subgraphs are constructed. It also fixes bugs in the TRT and NV TRT providers.

Addresses issue: https://github.com/microsoft/onnxruntime/issues/26653

**Graph Initializer Conversion and Handling:**

* Added a new method, `Graph::ConvertInitializersIntoOrtValues()`, that converts all graph TensorProto initializers into OrtValues and creates in-memory external data references. This separates the logic from graph construction and makes it reusable. (`include/onnxruntime/core/graph/graph.h`, `onnxruntime/core/graph/graph.cc`)
* Removed the lambda that previously converted large tensor initializers inside the graph constructor; the new method above now owns this responsibility. (`onnxruntime/core/graph/graph.cc`)

**Provider Interface Enhancements:**

* Introduced move assignment operators for `GraphProto` and `TensorProto` in both the provider interface (`ProviderHost`) and the wrapper structs, so these objects can be transferred and assigned without copying. (`onnxruntime/core/providers/shared_library/provider_interfaces.h`, `onnxruntime/core/providers/shared_library/provider_wrappedtypes.h`)
* Added iterator interfaces (`TensorProto_ConstIterator`, `TensorProto_Iterator`) and corresponding methods to `TensorProtos` for iterating over initializer lists. (`onnxruntime/core/providers/shared_library/provider_interfaces.h`, `onnxruntime/core/providers/shared_library/provider_wrappedtypes.h`)

**Execution Provider Logic Simplification:**

* Refactored initializer processing in the NVExecutionProvider to use the new conversion and iteration logic. This simplifies the handling of external and in-memory data and ensures correct assignment and ownership of user-provided weights. (`onnxruntime/core/providers/nv_tensorrt_rtx/nv_execution_provider.cc`)

**Other Minor Improvements:**

* Improved const-correctness and interface consistency for the size and iterator methods in `TensorProtos`. (`onnxruntime/core/providers/shared_library/provider_interfaces.h`, `onnxruntime/core/providers/shared_library/provider_wrappedtypes.h`)
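
The main structural change described above, moving initializer conversion out of the constructor and into an explicit pass, can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in and not the onnxruntime implementation: `ToyTensorProto`, `ToyOrtValue`, `ToyGraph`, the `min_bytes` threshold parameter, and the return value are assumptions made for illustration. Only the method name mirrors the one mentioned in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are NOT the onnxruntime types.
struct ToyTensorProto {
  std::string name;
  std::vector<unsigned char> raw_data;  // inline payload
  bool in_memory_external = false;      // true once the bytes live in a ToyOrtValue
};

struct ToyOrtValue {
  std::vector<unsigned char> buffer;  // owns the tensor bytes after conversion
};

class ToyGraph {
 public:
  // Construction only stores the initializers. No conversion happens here,
  // so constructing additional (sub)graphs does not trigger it.
  explicit ToyGraph(std::vector<ToyTensorProto> initializers)
      : initializers_(std::move(initializers)) {}

  // Explicit pass, intended to be called once before partitioning.
  // Moves every payload of at least `min_bytes` (assumed threshold) into a
  // ToyOrtValue and marks the proto as an in-memory reference.
  // Returns the number of initializers converted by this call.
  std::size_t ConvertInitializersIntoOrtValues(std::size_t min_bytes) {
    std::size_t converted = 0;
    for (ToyTensorProto& tensor : initializers_) {
      if (tensor.in_memory_external || tensor.raw_data.size() < min_bytes) {
        continue;  // already converted, or small enough to stay inline
      }
      assert(ort_values_.count(tensor.name) == 0 && "duplicate initializer name");
      ToyOrtValue value;
      value.buffer = std::move(tensor.raw_data);
      tensor.raw_data.clear();  // leave the moved-from vector in a known state
      tensor.in_memory_external = true;
      ort_values_[tensor.name] = std::move(value);
      ++converted;
    }
    return converted;
  }

  // Returns the converted value for `name`, or nullptr if none exists.
  const ToyOrtValue* FindOrtValue(const std::string& name) const {
    auto it = ort_values_.find(name);
    return it == ort_values_.end() ? nullptr : &it->second;
  }

  const std::vector<ToyTensorProto>& Initializers() const { return initializers_; }

 private:
  std::vector<ToyTensorProto> initializers_;
  std::unordered_map<std::string, ToyOrtValue> ort_values_;
};
```

In this sketch a caller would construct a `ToyGraph`, run `ConvertInitializersIntoOrtValues` once, and only then hand the graph to partitioning. Calling the pass a second time converts nothing, because already-converted initializers are skipped.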

References

#26721 - Make sure TRT EPs can loads models when initializers in memory

Author

yuslepukhin

Parents

838cf03c
