onnxruntime
Add BF16 kernels in several ops for Gemma-3
#26102
Merged

kunal-vaishnavi
kunal-vaishnavi Remove duplicate op counting
9916a246
kunal-vaishnavi Register SkipLayerNorm for bfloat16
8c3f997a
kunal-vaishnavi Fix warnings during build
980f9d8c
kunal-vaishnavi Add Conv-22 with bfloat16
5c7151c3
kunal-vaishnavi Add Pow-15 with bfloat16
b4d030a9
kunal-vaishnavi Add AveragePool-22 with bfloat16
27242b7c
kunal-vaishnavi Add weight-only quantization for MatMulNBits with bfloat16
92d3524d
nenad1002 commented on 2025-09-23
tianleiwu commented on 2025-09-23
kunal-vaishnavi Add end versions to op registrations
daa3fc77
kunal-vaishnavi Add unit tests
69cb2153
kunal-vaishnavi Return early if BF16 is not supported
7d6119a0
kunal-vaishnavi Remove BF16 CUDA NHWC kernels
82349667
kunal-vaishnavi Merge branch 'main' into kvaishnavi/gemma3-bf16
7d6cee3d
kunal-vaishnavi Fix build break with BF16 Conv test
fc9b62b4
kunal-vaishnavi Increase threshold for BF16 CUDA SkipLayerNorm test
cc483622
kunal-vaishnavi Add failing ONNX tests to list
3d1d2df7
tianleiwu commented on 2025-09-25
tianleiwu approved these changes on 2025-09-25
kunal-vaishnavi merged 99e10e68 into main 98 days ago
