onnxruntime
5c3f5449 - DQ→MatMulNBits fusion transformer for NvTensorRtRtx ep (#27466)

Commit
6 days ago
## Summary

Generalize the WebNN-specific DequantizeLinear → MatMulNBits graph fusion transformer so it can be reused by other execution providers (e.g. NvTensorRTRTX), and add defensive shape/size validation to prevent crashes on malformed tensors.

### Fusion patterns

**Pattern 1:** `DequantizeLinear → Reshape → Transpose → [Cast] → MatMul/Gemm` → **MatMulNBits**

**Pattern 2:** `DequantizeLinear (axis=0) → MatMul/Gemm` → **MatMulNBits**

---------

Co-authored-by: praneshgo <227579474+praneshgo@users.noreply.github.com>
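To make the two fusion patterns concrete, here is a minimal Python sketch of how such chains could be recognized on a toy graph. It is not the transformer's actual code (which lives in the onnxruntime C++ sources). `Node`, `MATMUL_OPS`, and `match_dq_matmul` are hypothetical names introduced for illustration. The sketch also assumes that each intermediate tensor has exactly one consumer and that the dequantized weight feeds the second (B) input of `MatMul`/`Gemm`; the commit message does not state either requirement.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

MATMUL_OPS = ("MatMul", "Gemm")


@dataclass
class Node:
    """Toy stand-in for a graph node (hypothetical, not an onnxruntime type)."""
    op_type: str
    inputs: List[str]
    outputs: List[str]
    attrs: Dict[str, int] = field(default_factory=dict)


def match_dq_matmul(nodes: List[Node], dq: Node) -> Optional[int]:
    """Return 1 or 2 if the chain starting at `dq` matches that pattern, else None."""
    if dq.op_type != "DequantizeLinear":
        return None

    # Map each tensor name to the nodes that read it.
    consumers: Dict[str, List[Node]] = {}
    for n in nodes:
        for tensor in n.inputs:
            consumers.setdefault(tensor, []).append(n)

    def sole_consumer(node: Node) -> Optional[Node]:
        # Assumption: fusion is only considered when there is exactly one reader.
        users = consumers.get(node.outputs[0], [])
        return users[0] if len(users) == 1 else None

    def is_weight_matmul(node: Optional[Node], producer: Node) -> bool:
        # Assumption: the dequantized weight must be the B (second) input.
        return (
            node is not None
            and node.op_type in MATMUL_OPS
            and len(node.inputs) > 1
            and node.inputs[1] == producer.outputs[0]
        )

    nxt = sole_consumer(dq)

    # Pattern 2: DequantizeLinear (axis=0) -> MatMul/Gemm.
    # ONNX DequantizeLinear defaults axis to 1, so axis must be set explicitly.
    if is_weight_matmul(nxt, dq) and dq.attrs.get("axis", 1) == 0:
        return 2

    # Pattern 1: DequantizeLinear -> Reshape -> Transpose -> [Cast] -> MatMul/Gemm.
    if nxt is None or nxt.op_type != "Reshape":
        return None
    transpose = sole_consumer(nxt)
    if transpose is None or transpose.op_type != "Transpose":
        return None
    tail = transpose
    nxt = sole_consumer(transpose)
    if nxt is not None and nxt.op_type == "Cast":  # the Cast is optional
        tail = nxt
        nxt = sole_consumer(nxt)
    return 1 if is_weight_matmul(nxt, tail) else None


# Usage: a Pattern 2 chain with hypothetical tensor names.
dq = Node("DequantizeLinear", ["w_q", "scale", "zp"], ["w"], {"axis": 0})
mm = Node("MatMul", ["x", "w"], ["y"])
print(match_dq_matmul([dq, mm], dq))
```

A real transformer would additionally validate tensor shapes, block sizes, and initializer sizes before rewriting the graph, which is the "defensive shape/size validation" the summary refers to; the sketch omits that step.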