[webgpu] Optimize matmulnbits with M > 1 #23102
[webgpu] Optimize matmulnbits with M > 1
d30cf803
Remove MatMulNBitsProgramPrefill
a349ad4f
remove components_a limitation
a7a7d9b7
make tile_m as class member
be81377e
merge MatMulNBitsWithLargeMProgram to MatMulNBitsProgram
d6277ea1
set tile M threshold
ca8ef7ab
guschmue approved these changes on 2024-12-17
guschmue merged 0981bbf4 into main 1 year ago
Assignees
No one assigned