onnxruntime
5160c67a - [webgpu] Add Matmul8bits Support (#24546)

Commit
293 days ago
### Description

This PR adds support for 8-bit quantization to the `MatMulNBits` operation in WebGPU. It does the following:

1. Unifies the fallback path on `MatMulNBitsProgram`, which was originally the generation path for block size = 32. It now supports any block size without limitations, and the original, more complicated programs are removed.
2. Enables `MatMulNBitsWideTileProgram` on all platforms.
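
For readers unfamiliar with block-wise quantized matmul, the following is a minimal CPU reference sketch of the arithmetic, not the WebGPU shader code from this PR. It assumes weights are stored as unsigned 8-bit values in rows of length K, with one scale per block of `block_size` elements along K and a default zero point of 128. The function names, the data layout, and the zero-point default are illustrative assumptions and are not taken from the commit.

```python
def dequantize_blockwise(q_rows, scales, block_size, zero_point=128):
    """Dequantize N rows of K uint8 weights (hypothetical layout).

    scales[n][b] is the scale of block b in row n, where b = k // block_size.
    block_size does not need to divide K; the last block may be partial.
    """
    return [
        [(q - zero_point) * s_row[k // block_size] for k, q in enumerate(q_row)]
        for q_row, s_row in zip(q_rows, scales)
    ]


def matmul_nbits_reference(a, q_rows, scales, block_size, zero_point=128):
    """Compute Y = A x W^T, where W is the dequantized N x K weight matrix."""
    w = dequantize_blockwise(q_rows, scales, block_size, zero_point)
    return [
        [sum(x * y for x, y in zip(a_row, w_row)) for w_row in w]
        for a_row in a
    ]


if __name__ == "__main__":
    a = [[1.0, 2.0, 3.0, 4.0]]       # M=1, K=4 activations
    q = [[129, 130, 127, 128]]       # N=1 row of 8-bit weights
    scales = [[0.5, 2.0]]            # one scale per block
    print(matmul_nbits_reference(a, q, scales, block_size=2))
    print(matmul_nbits_reference(a, q, scales, block_size=3))
```

The second call uses a block size that does not divide K, which shows that the per-element block index `k // block_size` places no special restriction on the block size.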