onnxruntime
7b2e3674 - [webgpu] Optimize DP4AMatMulNBitsSmallMProgram for intel (#25192)

Commit
176 days ago
### Description

This PR optimizes the Intel GPU path for the `DP4AMatMulNBitsSmallMProgram` by tuning `tile_size` and `tile_size_k_vec`.

### Motivation and Context

With this change, we achieved a >8% performance boost on Intel iGPUs (Xe-LP and Xe2-LPG) for the phi-4-mini-accuracy4 model.