onnxruntime
65cf878e - WebGPU QuantizeLinear: add per-axis support and int8 fixes

Commit
32 days ago
WebGPU QuantizeLinear: add per-axis support and int8 fixes

- Fix clamp range: use type-dependent constants (-128..127 for int8, 0..255 for uint8) instead of hardcoded (0, 255)
- Fix zero-point unpacking: use unpack4xI8 for signed types
- Add per-axis quantization support in WGSL shader and C++ host code
- Register QuantizeLinear kernels for opsets 13-18, 19-20, and 21
- Add int8 tests with exact-division scales to avoid GPU FP precision issues
- Exclude 3 existing Int8 tests from WebGPU EP due to FP division precision (same as DML)
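
The first three items can be illustrated with a small CPU-side reference. This is only a sketch in Python, not the WGSL shader or C++ host code from this commit: the function names (`clamp_range`, `unpack4x_i8`, `unpack4x_u8`, `quantize_linear_per_axis`) are hypothetical, the input is assumed to be a 2-D list quantized along axis 0, and rounding is assumed to be round-half-to-even as in the ONNX QuantizeLinear definition.

```python
def clamp_range(signed):
    """Type-dependent saturation bounds: int8 vs. uint8."""
    return (-128, 127) if signed else (0, 255)


def unpack4x_u8(word):
    """Split a 32-bit word into four unsigned bytes, lowest byte first."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def unpack4x_i8(word):
    """Split a 32-bit word into four sign-extended bytes, lowest byte first.

    Mirrors what a signed unpack has to do: reading packed int8 zero
    points with the unsigned variant would turn -1 into 255.
    """
    return [b - 256 if b >= 128 else b for b in unpack4x_u8(word)]


def quantize_linear_per_axis(x, scales, zero_points, signed):
    """Quantize a 2-D list along axis 0 (one scale/zero point per row).

    Hypothetical reference, not the commit's implementation.
    y = saturate(round(x / scale) + zero_point)
    """
    lo, hi = clamp_range(signed)
    out = []
    for row, scale, zp in zip(x, scales, zero_points):
        # Python's round() rounds halves to even, matching the assumed
        # ONNX rounding mode.
        out.append([max(lo, min(hi, round(v / scale) + zp)) for v in row])
    return out
```

The scales in a test for this kind of code are best chosen so that `x / scale` is exact in floating point (for example 0.5 or 0.25), which is the same reason the commit gives for its exact-division int8 tests.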