onnxruntime
DQ→MatMulNBits fusion transformer for NvTensorRtRtx ep
#27466

Merged

tianleiwu merged 8 commits into microsoft:main from anujj:MatMulNBits-trasformation-nv-trt-rtx

optimizer: fuse WebNN DQ chain back to MatMulNBits for NvTensorRTRTX

7e665906

optimizer(webnn): keep DQ->MatMulNBits fusion on phase-1 safe path

0434a1f4

optimizer(webnn): restore Gemm bias preserving DQ to MatMulNBits fusi…

e38ae4fa

Add SDXL Turbo fusion pattern for QDQ+MatMul -> MatMulNBits fusion

e4332496

optimizer(webnn): harden direct DQ->MatMulNBits fusion shape checks

887b6251

file name changes

f98418ff

xadupre commented on 2026-02-26

xadupre commented on 2026-02-27

refactor: split DQMatMulNBitsFusion::ApplyImpl into focused helpers

8559bcd8

github-advanced-security commented on 2026-03-02

tianleiwu added the release:1.24.3 label

anujj force pushed from fbce2c02 to 210a7c41 90 days ago

anujj force pushed from 210a7c41 to 143f7e04 90 days ago

address PR review: config-driven gating, SafeInt, minimal-build guard…

40207a90

anujj force pushed from 143f7e04 to 40207a90 90 days ago

tianleiwu approved these changes on 2026-03-04

tianleiwu enabled auto-merge (squash) 90 days ago

tianleiwu merged 5c3f5449 into main 90 days ago

Reviewers

tianleiwu

xadupre

Assignees

No one assigned

Labels

release:1.24.3

Milestone

No milestone