onnxruntime
c1719194 - Fix NeonFp16DequantB8Bit reference to match kernel fp16 precision (#27812)

Commit
36 days ago
The kernel computes neg_scaled_zp = -(scale * zp) in fp16 first (intermediate rounding), then uses it in the fma. For scale*zp in the range [128, 256), the fp16 ULP is 0.125, so this intermediate rounding error (up to ~0.06) propagates to the result and exceeds the test tolerance.

The reference computed everything in fp32 and converted to fp16 only at the end, avoiding this intermediate rounding. This caused mismatches of up to 0.15 (29 fp16 ULPs).

Fix: emulate the kernel's fp16 computation order in the reference:

1. neg_szp = MLAS_FP16(-(scale * zp)).ToFloat()  // fp16 round-trip
2. result = MLAS_FP16(neg_szp + value * scale)   // emulates fma
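The effect of the two computation orders can be reproduced outside MLAS with a small sketch. Everything named here is hypothetical and for illustration only: RoundToFp16 is a stand-in for the MLAS_FP16 round-trip (finite values only, no overflow handling), and DequantFp32Order is an assumed form of the old reference, since the commit message only says it computed in fp32 and converted at the end.

```cpp
#include <cmath>

// Hypothetical helper, not part of MLAS: round a float to the nearest
// binary16-representable value and return it as a float. Covers the normal
// and subnormal fp16 range; values beyond the fp16 maximum are not handled.
float RoundToFp16(float x) {
    if (x == 0.0f || !std::isfinite(x)) return x;
    int exp = 0;
    std::frexp(x, &exp);           // x = m * 2^exp with 0.5 <= |m| < 1
    if (exp < -13) exp = -13;      // below 2^-14 fp16 is subnormal, fixed spacing
    // fp16 carries 11 significant bits, so the spacing at this magnitude is 2^(exp-11).
    const float ulp = std::ldexp(1.0f, exp - 11);
    // nearbyint follows the current rounding mode (round-to-nearest-even by default).
    return std::nearbyint(x / ulp) * ulp;
}

// Assumed shape of the old reference: all arithmetic in fp32, one final rounding.
float DequantFp32Order(float scale, float zp, float value) {
    return RoundToFp16(value * scale - scale * zp);
}

// Order described in the commit message: round -(scale * zp) to fp16 first,
// then round the sum once more to emulate the fp16 fma.
float DequantKernelOrder(float scale, float zp, float value) {
    const float neg_szp = RoundToFp16(-(scale * zp));
    return RoundToFp16(neg_szp + value * scale);
}
```

With scale = 1.0009765625 (1 + 2^-10, exactly representable in fp16) and zp = value = 200, scale * zp is 200.1953125, which rounds to 200.25 in fp16. The fp32 order then yields 0, while the kernel order yields -0.0546875, a gap of the same magnitude as the ~0.06 intermediate rounding error described above.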