onnxruntime
65cf878e - WebGPU QuantizeLinear: add per-axis support and int8 fixes

Commit

32 days ago

WebGPU QuantizeLinear: add per-axis support and int8 fixes

- Fix clamp range: use type-dependent constants (-128..127 for int8, 0..255 for uint8) instead of hardcoded (0, 255)
- Fix zero-point unpacking: use unpack4xI8 for signed types
- Add per-axis quantization support in WGSL shader and C++ host code
- Register QuantizeLinear kernels for opsets 13-18, 19-20, and 21
- Add int8 tests with exact-division scales to avoid GPU FP precision issues
- Exclude 3 existing Int8 tests from WebGPU EP due to FP division precision (same as DML)
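The first three fixes can be illustrated with a small CPU-side reference. This is a minimal sketch, not the code from this commit: `QuantizeLinearPerAxis`, `Unpack4xI8`, and `Unpack4xU8` are hypothetical helper names, and the tensor is assumed to be flattened in row-major order with the quantization axis described by `axis_dim` and `inner` (the product of the dimensions after the axis). It shows the clamp bounds derived from the output type, a scale and zero point selected per slice along the axis, and why packed int8 zero points need a sign-extending unpack.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical reference helper (not the onnxruntime kernel).
// Quantizes a row-major tensor whose quantization axis has `axis_dim`
// entries, followed by `inner` contiguous elements per axis entry.
template <typename T>
std::vector<T> QuantizeLinearPerAxis(const std::vector<float>& x,
                                     std::size_t axis_dim, std::size_t inner,
                                     const std::vector<float>& scale,
                                     const std::vector<T>& zero_point) {
  // Type-dependent clamp range: -128..127 for int8_t, 0..255 for uint8_t.
  constexpr float lo = static_cast<float>(std::numeric_limits<T>::min());
  constexpr float hi = static_cast<float>(std::numeric_limits<T>::max());

  std::vector<T> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    // Index along the quantization axis for flat element i.
    const std::size_t a = (i / inner) % axis_dim;
    // Round to nearest (ties to even under the default rounding mode),
    // then add the zero point and saturate to the output type's range.
    const float q = std::nearbyint(x[i] / scale[a]) +
                    static_cast<float>(zero_point[a]);
    y[i] = static_cast<T>(std::clamp(q, lo, hi));
  }
  return y;
}

// CPU emulation of the WGSL unpack4xI8 builtin: each byte is sign-extended.
inline std::array<std::int32_t, 4> Unpack4xI8(std::uint32_t packed) {
  std::array<std::int32_t, 4> out{};
  for (int i = 0; i < 4; ++i) {
    out[i] = static_cast<std::int8_t>((packed >> (8 * i)) & 0xFFu);
  }
  return out;
}

// CPU emulation of the WGSL unpack4xU8 builtin: each byte is zero-extended.
inline std::array<std::uint32_t, 4> Unpack4xU8(std::uint32_t packed) {
  std::array<std::uint32_t, 4> out{};
  for (int i = 0; i < 4; ++i) {
    out[i] = (packed >> (8 * i)) & 0xFFu;
  }
  return out;
}
```

With a 2x2 input quantized along axis 1 (`axis_dim = 2`, `inner = 1`) and scales `{0.5f, 0.25f}`, the divisions are exact in floating point, which is the same idea the new int8 tests rely on. Unpacking a zero point byte of `0xFF` with the unsigned variant yields 255 rather than -1, which is the kind of error the unpack4xI8 fix addresses for signed types.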

Author

edgchen1

Parents

39bf86a2
