onnxruntime
c1719194 - Fix NeonFp16DequantB8Bit reference to match kernel fp16 precision (#27812)

Commit

128 days ago

Fix NeonFp16DequantB8Bit reference to match kernel fp16 precision (#27812)

The kernel computes neg_scaled_zp = -(scale * zp) in fp16 first (intermediate rounding), then uses it in the fma. For scale*zp in range [128, 256), the fp16 ULP is 0.125, so this intermediate rounding error (~0.06) propagates to the result and exceeded the test tolerance.

The reference was computing everything in fp32 and converting to fp16 only at the end, avoiding this intermediate rounding. This caused mismatches up to 0.15 (29 fp16 ULPs).

Fix: emulate the kernel's fp16 computation order in the reference:

1. neg_szp = MLAS_FP16(-(scale * zp)).ToFloat() // fp16 round-trip
2. result = MLAS_FP16(neg_szp + value * scale) // emulates fma
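The effect of the intermediate fp16 rounding can be reproduced outside MLAS. The following is a minimal Python sketch, not the actual test code: `to_fp16`, `dequant_wide`, and `dequant_fp16_order` are hypothetical helper names, the "wide" formula `(value - zp) * scale` is an assumption about the shape of the old reference, and Python's double-precision floats stand in for fp32. The fp16 round-trip uses the standard library's `struct` half-precision format (`"e"`).

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value, returned as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def dequant_wide(value: int, zp: int, scale: float) -> float:
    # Assumed shape of the old reference: compute in wide precision,
    # round to fp16 only once at the end.
    return to_fp16((value - zp) * scale)


def dequant_fp16_order(value: int, zp: int, scale: float) -> float:
    # Step 1: -(scale * zp) is rounded to fp16 before it is used.
    neg_szp = to_fp16(-(scale * zp))
    # Step 2: the fma is emulated by a wide multiply-add with a single
    # final rounding to fp16.
    return to_fp16(neg_szp + value * scale)


# Illustrative inputs chosen so that scale * zp falls in [128, 256),
# where the fp16 spacing is 0.125.
scale = to_fp16(1.0 + 17.0 / 1024.0)  # exactly representable in fp16
zp, value = 131, 132

wide = dequant_wide(value, zp, scale)
kernel_like = dequant_fp16_order(value, zp, scale)

print(scale * zp)                 # 133.1748046875, rounds to 133.125 in fp16
print(wide)                       # 1.0166015625
print(kernel_like)                # 1.06640625
print(abs(kernel_like - wide))    # 0.0498046875
```

With these inputs the product scale * zp = 133.1748046875 rounds to 133.125 in fp16, and that 0.0498 rounding error carries through unchanged to the final result, which is the kind of mismatch the commit describes.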

References

#27812 - Fix NeonFp16DequantB8Bit reference to match kernel fp16 precision

Author

jambayk

Parents

36c962fd
