onnxruntime
8de1639a - [webgpu] Enable DP4A MatMul generation path for Qualcomm (#24408)
Commit
280 days ago
[webgpu] Enable DP4A MatMul generation path for Qualcomm (#24408)

With this PR, the generation speed for phi4 improves about 2x on the Qualcomm Adreno X1 GPU (11.1 tps -> 23.2 tps for simple inputs).
References
#24408 - [webgpu] Enable DP4A MatMul generation path for Qualcomm
Author
qjia7
Parents
1f14dac5