onnxruntime
87165b92 - [js/webgpu] optimize MatmulNBits (#21747)

Commit
1 year ago
### Description

This optimization gives a 2x speedup for phi3 on an integrated Intel GPU. It mainly stores input A's data in a local variable instead of loading it from global memory each time it is multiplied with B's data.
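The idea of caching A can be illustrated with a small CPU-side sketch. This is not the WGSL shader from the commit, and it leaves out the N-bit dequantization of B entirely. The names `Matrix`, `loadA`, `matmulNaive`, `matmulCachedA`, and `countReads` are hypothetical, and the read counter only simulates global-memory loads of A so the two variants can be compared.

```typescript
// Hypothetical CPU-side illustration; not the actual shader code from the commit.
type Matrix = { data: Float32Array; rows: number; cols: number };

// Counts simulated "global memory" loads of A.
let globalReadsOfA = 0;

function loadA(a: Matrix, row: number, k: number): number {
  globalReadsOfA++;
  return a.data[row * a.cols + k];
}

// Baseline: every multiply re-reads the A element from "global" memory.
function matmulNaive(a: Matrix, b: Matrix): Float32Array {
  const out = new Float32Array(a.rows * b.cols);
  for (let m = 0; m < a.rows; m++) {
    for (let n = 0; n < b.cols; n++) {
      let sum = 0;
      for (let k = 0; k < a.cols; k++) {
        sum += loadA(a, m, k) * b.data[k * b.cols + n];
      }
      out[m * b.cols + n] = sum;
    }
  }
  return out;
}

// Cached: copy one row of A into a local array once,
// then reuse it for every column of B.
function matmulCachedA(a: Matrix, b: Matrix): Float32Array {
  const out = new Float32Array(a.rows * b.cols);
  const aLocal = new Float32Array(a.cols);
  for (let m = 0; m < a.rows; m++) {
    for (let k = 0; k < a.cols; k++) {
      aLocal[k] = loadA(a, m, k);
    }
    for (let n = 0; n < b.cols; n++) {
      let sum = 0;
      for (let k = 0; k < a.cols; k++) {
        sum += aLocal[k] * b.data[k * b.cols + n];
      }
      out[m * b.cols + n] = sum;
    }
  }
  return out;
}

// Runs one variant and reports how many simulated loads of A it performed.
function countReads(
  fn: (a: Matrix, b: Matrix) => Float32Array,
  a: Matrix,
  b: Matrix,
): { result: Float32Array; reads: number } {
  globalReadsOfA = 0;
  const result = fn(a, b);
  return { result, reads: globalReadsOfA };
}
```

In this sketch the baseline loads each element of A once per output column, while the cached variant loads it once per row, so the number of simulated loads drops by a factor equal to the number of columns of B. How much of that carries over to a real GPU depends on the hardware and the shader's tiling, which this sketch does not model.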