[webgpu] Apply dp4a for generation shader #24064
[webgpu] Apply dp4a for generation shader
d9430217
support any block_size % 32 = 0
356410a5
apply it only for float type
4cd3a3a1
Merge branch 'main' into matmulnbist_dp4a_gen
e116df85
use 1D dispatch group size
4631638e
Adjust the code to make it more flexible
5074d164
Use workgroup size = 128
d96de51c
Add more annotations
36db69d4
qjia7
marked this pull request as draft 1 year ago
fix error in scale_a
701acbd3
Extract common functions for code reuse
e538dd57
qjia7
marked this pull request as ready for review 1 year ago
address comments
f3a93e74
address comments
f9ac9ab1
qjia7
dismissed their stale review
via f9ac9ab1
1 year ago
guschmue
approved these changes
on 2025-03-20
guschmue
merged
127c8503
into main 1 year ago
guschmue
deleted the matmulnbist_dp4a_gen branch 1 year ago