PR #26102 Add BF16 kernels in several ops for Gemma-3

kunal-vaishnavi merged 15 commits into microsoft:main from kunal-vaishnavi:kvaishnavi/gemma3-bf16

Remove duplicate op counting

9916a246

8c3f997a

Fix warnings during build

980f9d8c

Add Conv-22 with bfloat16

5c7151c3

Add Pow-15 with bfloat16

b4d030a9

Add AveragePool-22 with bfloat16

27242b7c

Add weight-only quantization for MatMulNBits with bfloat16

92d3524d

nenad1002 commented on 2025-09-23

tianleiwu commented on 2025-09-23

Add end versions to op registrations

daa3fc77

Add unit tests

69cb2153

Return early if BF16 is not supported

7d6119a0

Remove BF16 CUDA NHWC kernels

82349667

Merge branch 'main' into kvaishnavi/gemma3-bf16

7d6cee3d

Fix build break with BF16 Conv test

fc9b62b4

Increase threshold for BF16 CUDA SkipLayerNorm test

cc483622

Add failing ONNX tests to list

3d1d2df7

tianleiwu commented on 2025-09-25

tianleiwu approved these changes on 2025-09-25

kunal-vaishnavi merged 99e10e68 into main 98 days ago

Reviewers

tianleiwu

nenad1002
