[webgpu] Optimize MatMulNBits for f16 Block32 prefill performance #23908
daijh force pushed from 8a250dbf to 74da2901 1 year ago
sushanthr
approved these changes
on 2025-03-25
sushanthr
approved these changes
on 2025-03-26
[webgpu] Optimize MatMulNBits for f16 Block32 prefill performance
1be49b21
Resolve comments
4ead0043
Fix variable naming
14bbe9df
Add comment on f32 accumulator
0f55827f
Improve comment
695d9d05
More comment and avoid magic number
fd751cbd
Improve variable naming
58d76f62
Add tile_m and tile_n into constructor
ae482a2a
Rename to MatMulNBitsWideTileProgram
287be7e3
Improve comment to reflect new naming
ca1710a9
daijh force pushed from 4d3801f3 to ca1710a9 346 days ago
Fix lint
17c0b1f3
guschmue
dismissed these changes
on 2025-04-02
Prefer onnxruntime::narrow
e7f8bb43
guschmue
dismissed their stale review
343 days ago
guschmue
approved these changes
on 2025-04-04
guschmue merged 3dfc2ae3 into main 343 days ago
daijh
deleted the matmul-f16-block32-prefill branch 343 days ago