onnxruntime
fd6bab42 - [js/webgpu] Provide a vectorized algorithm for GroupedConv (#18884)

Commit
2 years ago
### Description

This PR provides a vectorized algorithm for NHWC GroupedConv to improve performance. On an Intel Alder Lake machine, the aggregate time of GroupedConv in mobilenetv2-12 drops from ~4ms to ~1ms, roughly a 20% improvement for the whole model.
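For readers unfamiliar with the idea, here is a minimal CPU-side sketch in TypeScript of what computing several adjacent output channels together means for a grouped convolution in NHWC layout. It is not the WGSL shader from this PR. The function name `groupedConvNHWC`, the `ConvShape` interface, the `[KH, KW, C/groups, M]` weight layout, the lane count of 4, and the stride-1/no-padding simplification are all assumptions made for illustration.

```typescript
// Hypothetical shape descriptor (not an onnxruntime type).
interface ConvShape {
  n: number; h: number; w: number; c: number; // input dims, NHWC
  kh: number; kw: number;                     // kernel height and width
  m: number;                                  // output channels
  groups: number;                             // must divide both c and m
}

// Grouped convolution over NHWC input, stride 1, no padding, no dilation.
// Weights are assumed to be laid out as [KH, KW, C/groups, M].
// Output channels are processed in blocks of LANES, mimicking a vec4
// accumulator: in NHWC the channel axis is innermost, so the LANES outputs
// (and, for depthwise conv, the LANES inputs) are contiguous in memory.
function groupedConvNHWC(x: Float32Array, wgt: Float32Array, s: ConvShape): Float32Array {
  const LANES = 4;
  const cPerGroup = s.c / s.groups;
  const mPerGroup = s.m / s.groups;
  const oh = s.h - s.kh + 1;
  const ow = s.w - s.kw + 1;
  const y = new Float32Array(s.n * oh * ow * s.m);

  for (let b = 0; b < s.n; b++) {
    for (let oy = 0; oy < oh; oy++) {
      for (let ox = 0; ox < ow; ox++) {
        for (let oc0 = 0; oc0 < s.m; oc0 += LANES) {
          const lanes = Math.min(LANES, s.m - oc0); // tail block may be short
          const acc: number[] = [0, 0, 0, 0];       // one accumulator per lane
          for (let ky = 0; ky < s.kh; ky++) {
            for (let kx = 0; kx < s.kw; kx++) {
              const pixel = ((b * s.h + oy + ky) * s.w + ox + kx) * s.c;
              for (let ic = 0; ic < cPerGroup; ic++) {
                const wBase = ((ky * s.kw + kx) * cPerGroup + ic) * s.m;
                for (let lane = 0; lane < lanes; lane++) {
                  const oc = oc0 + lane;
                  const g = Math.floor(oc / mPerGroup); // group of this output channel
                  acc[lane] += x[pixel + g * cPerGroup + ic] * wgt[wBase + oc];
                }
              }
            }
          }
          const out = ((b * oh + oy) * ow + ox) * s.m + oc0;
          for (let lane = 0; lane < lanes; lane++) {
            y[out + lane] = acc[lane];
          }
        }
      }
    }
  }
  return y;
}
```

In the depthwise case (`groups == c == m`), each lane reads input channel `oc0 + lane`, so the four input values, four weights, and four outputs are each one contiguous run. That is the access pattern a GPU shader could plausibly serve with single `vec4` loads and stores, which is where a speedup of this kind would be expected to come from.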