onnxruntime
405dcd72 - Add SkipLayerNorm fusion with bias Add (#27765)

Commit
46 days ago
Add SkipLayerNorm fusion with bias Add (#27765)

### Description

This pull request introduces a new graph optimization pass to fuse Add + SkipLayerNormalization subgraphs into a single SkipLayerNormalization node that incorporates a bias input. This helps simplify the computation graph, especially for models using bias after MatMul, and extends support for more execution providers. The main changes include the implementation of the new fusion, its integration into the optimizer pipeline, and updates to provider compatibility.

**New Bias + SkipLayerNormalization Fusion:**

* Added a new `BiasSkipLayerNormFusion` class and implementation to detect and fuse subgraphs where a 1D bias is added to a MatMul (optionally through a Cast) before SkipLayerNormalization, replacing them with a single node that absorbs the bias as a fifth input.

**Integration into Optimization Pipeline:**

* Registered the new `BiasSkipLayerNormFusion` in the graph transformer utility, ensuring it runs after the standard SkipLayerNorm fusion and covers more execution providers (CPU, ACL, CUDA, DML, JS, WebGPU).

**Test and Include Updates:**

* Updated test and implementation files to include the new fusion header where relevant.

### Motivation and Context

These changes collectively improve model optimization by reducing node count and improving runtime efficiency for supported providers. This PR also helps perform this fusion on many models inside the Foundry Local catalog without needing to re-deploy models.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
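### Illustrative sketch

The rewrite described above is only valid if folding the bias into the normalization node leaves the output unchanged. The pure-Python sketch below checks that on one row of data. It is not the ONNX Runtime implementation: `layer_norm` and `skip_layer_norm` are hypothetical helpers, and the assumed operator semantics (layer normalization of `input + skip + bias`, then scale by `gamma` and shift by `beta`) are a simplification made for illustration.

```python
import math


def layer_norm(row, gamma, beta, eps=1e-5):
    """Normalize one row to zero mean / unit variance, then scale and shift."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * g + b for v, g, b in zip(row, gamma, beta)]


def skip_layer_norm(x, skip, gamma, beta, bias=None, eps=1e-5):
    """Hypothetical reference: LayerNorm(x + skip + bias), bias optional (assumed semantics)."""
    if bias is None:
        bias = [0.0] * len(x)
    summed = [a + s + c for a, s, c in zip(x, skip, bias)]
    return layer_norm(summed, gamma, beta, eps)


# Made-up values standing in for one row of a MatMul output and its neighbours.
matmul_out = [0.5, -1.25, 2.0, 0.75]
skip = [1.0, 0.25, -0.5, 2.0]
bias = [0.1, -0.2, 0.3, 0.4]    # the 1D bias that the separate Add node applies
gamma = [1.0, 0.5, 2.0, 1.5]
beta = [0.0, 0.1, -0.1, 0.2]

# Before fusion: a separate Add node feeds a four-input SkipLayerNormalization.
added = [m + b for m, b in zip(matmul_out, bias)]
unfused = skip_layer_norm(added, skip, gamma, beta)

# After fusion: the Add is gone and the bias is passed as the fifth input.
fused = skip_layer_norm(matmul_out, skip, gamma, beta, bias=bias)

# The two paths add the same terms in a different order, so compare with a tolerance.
max_abs_diff = max(abs(u - f) for u, f in zip(unfused, fused))
print("max abs difference:", max_abs_diff)
```

Because the fused and unfused paths sum the same terms in a different order, the results may differ by floating-point rounding, which is why the comparison uses a tolerance rather than exact equality.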