onnxruntime
5160c67a - [webgpu] Add Matmul8bits Support (#24546)

Commit

344 days ago

[webgpu] Add Matmul8bits Support (#24546)

### Description

This PR adds support for 8-bit quantization to the `MatMulNBits` operation in WebGPU. It makes the following changes:

1. Unify on `MatMulNBitsProgram` as the fallback path. It was originally the generation path for block size = 32; it now supports any block size without limitations, and the original, more complicated programs are removed.
2. Enable `MatMulNBitsWideTileProgram` for all platforms.
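To illustrate what block-wise 8-bit quantized matrix multiplication computes, here is a minimal reference sketch in plain Python. It is not the WebGPU shader code from this PR. The function name `matmul_8bit_blockwise`, the unpacked `N x K` weight layout, and the explicit per-block zero points are all assumptions made for illustration; the real operator stores weights in packed blobs.

```python
def matmul_8bit_blockwise(a, b_quant, scales, zero_points, block_size):
    """Reference sketch of Y = A x dequant(B)^T with 8-bit block-wise quantization.

    a:           M x K list of floats (activations)
    b_quant:     N x K list of ints in [0, 255] (hypothetical unpacked layout)
    scales:      N x ceil(K / block_size) list of floats, one per block
    zero_points: N x ceil(K / block_size) list of ints, one per block
    block_size:  number of consecutive K elements sharing one scale/zero point
    """
    m_rows, k_dim, n_cols = len(a), len(a[0]), len(b_quant)
    out = [[0.0] * n_cols for _ in range(m_rows)]
    for n in range(n_cols):
        # Dequantize output column n once: (q - zero_point) * scale per block.
        col = [
            (b_quant[n][k] - zero_points[n][k // block_size])
            * scales[n][k // block_size]
            for k in range(k_dim)
        ]
        for m in range(m_rows):
            out[m][n] = sum(a[m][k] * col[k] for k in range(k_dim))
    return out


# One row of activations, one output column, K = 4, block size = 2.
a = [[1.0, 2.0, 3.0, 4.0]]
b_quant = [[130, 126, 128, 132]]
result = matmul_8bit_blockwise(a, b_quant, [[0.5, 0.25]], [[128, 128]], 2)
print(result)  # [[3.0]]
```

Because the block index is simply `k // block_size`, the same loop works for any block size, which mirrors the "any block size" property described for the fallback path above.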

References

#24546 - [webgpu] Add Matmul8bits Support

Author

qjia7

Parents

1f4156c6
