onnxruntime
[WebGPU] QKV and MLP layer fusions for Qwen3-style models
#28280
Open

hariharans29 wants to merge 32 commits into main from hari/webgpu_perf_1
hariharans29 Initial commit
7c0c1b92
hariharans29 Merge remote-tracking branch 'origin' into hari/webgpu_perf_1
c42879d1
hariharans29 More changes
a0550b68
hariharans29 Merge branch 'hari/webgpu_perf_1' of https://github.com/microsoft/onn…
c55adfed
hariharans29 Stage
ee09d8e4
hariharans29 More changes
aa357eef
hariharans29 Stage
318b26be
hariharans29 Works and good perf
ad53b3d9
hariharans29 Skip + MatmulNBitsSilu fusion - works and good perf
b67ae811
hariharans29 Cleanup
01671d9c
hariharans29 changed the title from "[DO NOT REVIEW]: Title-TODO" to "[DO NOT REVIEW]: TODO" 13 days ago
hariharans29 Move back to workgroup/tile_size default
30485ddf
hariharans29 Merge main
27317b8a
hariharans29 requested a review from copilot-pull-request-reviewer 13 days ago
github-actions commented on 2026-04-30
github-advanced-security commented on 2026-04-30
copilot-pull-request-reviewer commented on 2026-04-30
hariharans29 Merge remote-tracking branch 'origin' into hari/webgpu_perf_1
a56fb564
hariharans29 Copilot comments + Fix builds + Fix lint + Fusion diagrams
13bf9793
hariharans29 Fix test
d1090c86
hariharans29 Fix builds
ffacd4c5
hariharans29 Fixes
92874ce7
hariharans29 Slim PR: drop benchmark harness, lazy buffer-mgr fix, consteval fix, …
a7899c6a
hariharans29 changed the title from "[DO NOT REVIEW]: TODO" to "[WebGPU]: QKV and MLP fusions for Qwen3" 11 days ago
hariharans29 Remove unused dp4a_matmul_mlp.wgsl.template
2039c7f2
hariharans29 Cleanup: drop unused empty namespace + env_var_utils include in graph…
a02cf125
hariharans29 Merge remote-tracking branch 'origin' into hari/webgpu_perf_1
beb1709e
hariharans29 requested a review from copilot-pull-request-reviewer 11 days ago
copilot-pull-request-reviewer commented on 2026-05-02
hariharans29 changed the title from "[WebGPU]: QKV and MLP fusions for Qwen3" to "[WebGPU] QKV and MLP fusions for Qwen3" 11 days ago
hariharans29 Copilot comments
90650634
hariharans29 Fixes
4ac9c816
hariharans29 Fix
306fba37
hariharans29 Use fresh WebGPU EP per session in fusion-vs-unfused tests
6c8c7a35
guschmue added the ep:WebGPU label
qjia7 commented on 2026-05-09
qjia7 commented on 2026-05-09
hariharans29 Remove unused file
a90a049c
qjia7 commented on 2026-05-11
hariharans29 [WebGPU] Extract shared LayerNorm/SkipLayerNorm program runners
007a78e7
hariharans29 [WebGPU] MatMulNBitsMlp: adopt shared norm helpers + activation enum
37db5b87
hariharans29 [WebGPU] MatMulNBitsMlpFusion: match fused-QuickGelu MLP shape
2c1a2a36
hariharans29 [WebGPU/JSEP] Enable QuickGeluFusion for WebGPU and JSEP EPs
234bcf44
hariharans29 commented on 2026-05-11
hariharans29 requested a review from copilot-pull-request-reviewer 1 day ago
copilot-pull-request-reviewer commented on 2026-05-11
hariharans29 Copilot comments
eaa6635c
hariharans29 Merge main and resolve conflicts
106c07ef
hariharans29 changed the title from "[WebGPU] QKV and MLP fusions for Qwen3" to "[WebGPU] QKV and MLP fusions for Qwen3-style models" 15 hours ago
hariharans29 changed the title from "[WebGPU] QKV and MLP fusions for Qwen3-style models" to "[WebGPU] QKV and MLP layer fusions for Qwen3-style models" 15 hours ago
