Add BF16 kernels in several ops for Gemma-3 (#26102)
### Description
This PR adds missing bfloat16 kernels for several ops in both the
`ai.onnx` and `com.microsoft` domains:
1. `SkipLayerNormalization` [contrib
op](https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md#commicrosoftskiplayernormalization)
2. `Conv` for [opset 22](https://onnx.ai/onnx/operators/onnx__Conv.html)
3. `Pow` for [opset 15](https://onnx.ai/onnx/operators/onnx__Pow.html)
4. `AveragePool` for [opset
22](https://onnx.ai/onnx/operators/onnx__AveragePool.html)

This PR also enables weight-only quantization of a bfloat16 `MatMul` op
to a bfloat16 `MatMulNBits` [contrib
op](https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md#commicrosoftmatmulnbits).
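For reviewers less familiar with the format, the snippet below is a small NumPy sketch (not code from this PR) of what bfloat16 rounding does to float32 values: bfloat16 keeps only the top 16 bits of a float32, so the kernels added here trade mantissa precision for the same dynamic range. The helper name `to_bfloat16` is hypothetical and purely illustrative.

```python
import numpy as np

def to_bfloat16(x: np.ndarray) -> np.ndarray:
    """Emulate float32 -> bfloat16 rounding (round-to-nearest-even),
    returning the rounded values widened back to float32 for inspection.
    NaN payloads are not specially handled in this sketch."""
    bits = x.astype(np.float32).view(np.uint32)
    # bfloat16 keeps the top 16 bits of a float32; adding 0x7FFF plus the
    # lowest kept mantissa bit before truncating gives round-to-nearest-even
    rounded = bits + 0x7FFF + ((bits >> 16) & 1)
    return (rounded & 0xFFFF0000).view(np.float32)

# pi in float32 (0x40490FDB) rounds to 3.140625 (0x40490000) in bfloat16
print(to_bfloat16(np.array([1.0, np.pi], dtype=np.float32)))
```

Exactly representable values such as 1.0 pass through unchanged, while values with more than 7 mantissa bits are rounded, which is why op outputs can differ slightly from their float32 counterparts.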
### Motivation and Context
This PR enables running ONNX models from the Gemma-3 family that are
generated with bfloat16 precision.