onnxruntime
b76994dc - Improve CUDA EP's GetCapability (#17809)

Commit
2 years ago
Improve CUDA EP's GetCapability (#17809)

Improve CUDA EP's GetCapability: Add layout transformer support.

Currently the code checks whether a node is already assigned to an EP and, if so, returns immediately.

```c++
if (!node.GetExecutionProviderType().empty()) {
  return;
}
```

So, if you call the GetCapability function twice,

```c++
auto caps = GetCapability();
assign_nodes_to_eps(..., caps, ...);
auto caps2 = GetCapability();
```

the second GetCapability() call returns fewer results than the first one, because the nodes assigned after the first call are now skipped. The layout transformer needs to call GetCapability twice as above, so the current GetCapability() implementation is incompatible with the layout transformer. This is not an issue right now because the CUDA EP doesn't need to do layout transformation, but we might want to support a different layout.
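The shrinking result on the second call can be reproduced with a small toy model. The sketch below is illustrative only: `ToyNode`, `GetCapabilitySkippingAssigned`, `GetCapabilityRepeatable`, and `AssignNodes` are hypothetical names, not onnxruntime APIs, and the "repeatable" variant is just one assumed way to make repeated calls agree, not necessarily what this commit does.

```c++
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph node (not the real onnxruntime::Node).
struct ToyNode {
  std::string op_type;
  std::string assigned_ep;  // empty means "not assigned to any EP yet"
};

// Mimics the early return described above: any node that already
// carries an EP name is skipped, regardless of which EP it is.
std::vector<std::size_t> GetCapabilitySkippingAssigned(
    const std::vector<ToyNode>& nodes) {
  std::vector<std::size_t> caps;
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    if (!nodes[i].assigned_ep.empty()) continue;
    caps.push_back(i);
  }
  return caps;
}

// Assumed alternative: skip only nodes owned by a *different* EP, so
// nodes this EP claimed earlier are reported again on later calls.
std::vector<std::size_t> GetCapabilityRepeatable(
    const std::vector<ToyNode>& nodes, const std::string& ep) {
  std::vector<std::size_t> caps;
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    const std::string& owner = nodes[i].assigned_ep;
    if (!owner.empty() && owner != ep) continue;
    caps.push_back(i);
  }
  return caps;
}

// Hypothetical counterpart of assign_nodes_to_eps: tags the claimed nodes.
void AssignNodes(std::vector<ToyNode>& nodes,
                 const std::vector<std::size_t>& caps,
                 const std::string& ep) {
  for (std::size_t i : caps) nodes[i].assigned_ep = ep;
}
```

With three nodes, one of them already owned by another EP, the first variant reports two nodes on the first call and none after `AssignNodes` runs, while the second variant keeps reporting the same two nodes.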
Author
Changming Sun