Resolve issue when running Yolov4 on DNNL EP (#9355)

The dnnl_binary ops need the memory format of their inputs to match the format
expected by ONNX Runtime. If the memory formats of the inputs do not match each
other, the calculated results will be wrong.
Additionally, since the code manually pads the tensor dimensions for broadcasting,
the inputs are expected to be in ONNX Runtime's format.
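
For illustration only, a minimal sketch of the kind of dimension padding meant
here. PadDimsForBroadcast is a hypothetical name, not the helper used in the
DNNL EP; it assumes ONNX/NumPy-style broadcasting, where shapes are aligned on
the right and missing leading dimensions are treated as 1. Padding this way only
describes the data correctly if the buffer is in the plain layout ONNX Runtime
expects, which is why the inputs have to be reordered first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not taken from the DNNL EP sources).
// Prepends 1s to `dims` until it has `target_rank` entries, so that two
// inputs of different rank can be described with the same number of dims.
std::vector<int64_t> PadDimsForBroadcast(const std::vector<int64_t>& dims,
                                         std::size_t target_rank) {
  // A shape can only be padded up to a larger (or equal) rank.
  assert(dims.size() <= target_rank);
  // Start with the leading 1s, then append the original dims unchanged.
  std::vector<int64_t> padded(target_rank - dims.size(), 1);
  padded.insert(padded.end(), dims.begin(), dims.end());
  return padded;
}
```

With this sketch, a {3, 4} input broadcast against a rank-4 tensor would be
described as {1, 1, 3, 4}.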
Detecting and reordering memory to the Ort format is what was previously done
for the Reshape op, so that code was moved from dnnl_reshape to
dnnl_subgraph_primitive under the name GetMemoryInOrtFormat.
One small additional change was made to the capability code log: it now also
prints the percentage of nodes run by the dnnl execution provider.
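
As a rough sketch of the logged value, assuming it is the share of graph nodes
claimed by the execution provider. PercentNodesSupported and its parameters are
hypothetical names, not the ones used in the capability code.

```cpp
#include <cstddef>

// Hypothetical helper: percentage of graph nodes claimed by the EP.
// Returns 0 for an empty graph to avoid dividing by zero.
double PercentNodesSupported(std::size_t supported_nodes,
                             std::size_t total_nodes) {
  if (total_nodes == 0) {
    return 0.0;
  }
  return 100.0 * static_cast<double>(supported_nodes) /
         static_cast<double>(total_nodes);
}
```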
Signed-off-by: George Nash <george.nash@intel.com>