[WebGPU] QKV and MLP layer fusions for Qwen3-style models #28280
Initial commit
7c0c1b92
Merge remote-tracking branch 'origin' into hari/webgpu_perf_1
c42879d1
More changes
a0550b68
Merge branch 'hari/webgpu_perf_1' of https://github.com/microsoft/onn…
c55adfed
Stage
ee09d8e4
More changes
aa357eef
Stage
318b26be
Works and good perf
ad53b3d9
Skip + MatmulNBitsSilu fusion - works and good perf
b67ae811
Cleanup
01671d9c
hariharans29
changed the title from "[DO NOT REVIEW]: Title-TODO" to "[DO NOT REVIEW]: TODO" 13 days ago
Move back to workgroup/tile_size default
30485ddf
Merge main
27317b8a
Merge remote-tracking branch 'origin' into hari/webgpu_perf_1
a56fb564
Copilot comments + Fix builds + Fix lint + Fusion diagrams
13bf9793
Fix test
d1090c86
Fix builds
ffacd4c5
Fixes
92874ce7
Slim PR: drop benchmark harness, lazy buffer-mgr fix, consteval fix, …
a7899c6a
hariharans29
changed the title from "[DO NOT REVIEW]: TODO" to "[WebGPU]: QKV and MLP fusions for Qwen3" 11 days ago
Remove unused dp4a_matmul_mlp.wgsl.template
2039c7f2
Cleanup: drop unused empty namespace + env_var_utils include in graph…
a02cf125
Merge remote-tracking branch 'origin' into hari/webgpu_perf_1
beb1709e
hariharans29
changed the title from "[WebGPU]: QKV and MLP fusions for Qwen3" to "[WebGPU] QKV and MLP fusions for Qwen3" 11 days ago
Copilot comments
90650634
Fixes
4ac9c816
Fix
306fba37
Use fresh WebGPU EP per session in fusion-vs-unfused tests
6c8c7a35
qjia7
commented
on 2026-05-09
qjia7
commented
on 2026-05-09
Remove unused file
a90a049c
qjia7
commented
on 2026-05-11
[WebGPU] Extract shared LayerNorm/SkipLayerNorm program runners
007a78e7
[WebGPU] MatMulNBitsMlp: adopt shared norm helpers + activation enum
37db5b87
[WebGPU] MatMulNBitsMlpFusion: match fused-QuickGelu MLP shape
2c1a2a36
[WebGPU/JSEP] Enable QuickGeluFusion for WebGPU and JSEP EPs
234bcf44
Copilot comments
eaa6635c
Merge main and resolve conflicts
106c07ef
hariharans29
changed the title from "[WebGPU] QKV and MLP fusions for Qwen3" to "[WebGPU] QKV and MLP fusions for Qwen3-style models" 15 hours ago
hariharans29
changed the title from "[WebGPU] QKV and MLP fusions for Qwen3-style models" to "[WebGPU] QKV and MLP layer fusions for Qwen3-style models" 15 hours ago
Assignees
No one assigned