onnxruntime
[webgpu] Add Matmul8bits Support
#24546
Merged

sushraja-msft merged 22 commits into main from matmul8bits
qjia7 [webgpu] Support nbits = 8 for matmulnbits
3bca9e8a
qjia7 add 8bits support for non-dp4a path
eee685d3
qjia7 add 8bits support for subgroup matrix path
48fc5c28
qjia7 add nbits=8 support for MatMulNBitsWideTileProgram
502bd2cd
qjia7 temporarily disable dp4 path for nbits = 8
3a5d892f
qjia7 add zero points for MatMulNBitsBlockWiseProgram
450a5052
qjia7 support any components of A for MatMulNBitsBlockWiseProgram
a5fa8ed3
qjia7 remove all limitations of MatMulNBitsBlockWiseProgram
f3b2a4aa
qjia7 Fix bugs in zero_points when nbits = 4
3a77a7ea
qjia7 remove unused code
95aedbf4
qjia7 Merge branch 'main' into matmul8bits
6077d228
qjia7 enable tests for 8bits
f6c583bd
qjia7 use flatten workgroup_idx
57457421
qjia7 add todo
5e3e602b
qjia7 marked this pull request as ready for review 1 year ago
qjia7 requested a review from sushraja-msft 1 year ago
qjia7 requested a review from guschmue 1 year ago
guschmue added ep:WebGPU
sushanthr commented on 2025-04-28
sushanthr commented on 2025-04-28
sushanthr commented on 2025-04-28
sushanthr commented on 2025-04-28
sushanthr commented on 2025-04-28
qjia7 fix the dp4 path overflow issue
186a79cc
qjia7 marked this pull request as draft 1 year ago
qjia7 address comments
98cc62d9
qjia7 address comments
643b64aa
qjia7 Merge branch 'main' into matmul8bits
5a396289
qjia7 rename MatMulNBitsBlockWiseProgram to MatMulNBitsProgram
b078fd28
qjia7 refactor ReadZeroPoint
1da9f83a
qjia7 marked this pull request as ready for review 1 year ago
sushraja-msft approved these changes on 2025-05-02
sushraja-msft dismissed these changes on 2025-05-02
sushraja-msft commented on 2025-05-05
qjia7 address comments
1a378af7
qjia7 dismissed their stale review via 1a378af7 1 year ago
qjia7 requested a review from sushraja-msft 1 year ago
qjia7 fix the warning
07181a60
guschmue approved these changes on 2025-05-06
sushraja-msft merged 5160c67a into main 1 year ago
sushraja-msft deleted the matmul8bits branch 1 year ago
