onnxruntime
4ac72e30 - NHWC Resize optimization (#11825)

Commit
3 years ago
NHWC Resize optimization (#11825)

The optimization consists of:
* Use int32_t instead of int64_t
* Use different code path for tf_crop_and_resize or other coordinate_transformation_mode to avoid redundant conditions
* Loop-invariant code motion of offset, coefficient and extrapolation_value check
* Use fixed point to avoid floating-point computation

Besides, it always transforms NCHW Resize to NHWC because it has higher perf in the NHWC variant when the input X is 4D int8/uint8 tensor and the mode is linear on ARM.

It improves DeepLab V3 with int8 quantization by 26%~27% on big core and 37% on LITTLE core on AArch64. It also improves DeepLab V3 with uint8 quantization by 24%~25% on big core and 34% on LITTLE core on AArch64.

Co-authored-by: Yufeng Li liyufeng1987@gmail.com
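The techniques listed in the commit message (int32_t indices, loop-invariant offsets and coefficients, fixed-point weights) can be illustrated with a small standalone sketch. This is not the code from the commit: the names `kFracBits`, `ToFixed`, `BlendFixed`, `ColumnParam` and `ResizeBilinearNHWC` are hypothetical, the choice of 10 fractional bits is an assumption, and the asymmetric coordinate mapping (`in = out * in_size / out_size`) is used only because it is the simplest one to show.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed precision: 10 fractional bits. 255 * 2^10 * 2^10 still fits in int32_t.
constexpr int32_t kFracBits = 10;
constexpr int32_t kOne = 1 << kFracBits;

// Convert a weight in [0, 1] to fixed point once, outside the pixel loop.
inline int32_t ToFixed(float w) {
  assert(w >= 0.0f && w <= 1.0f);
  return static_cast<int32_t>(std::lround(w * kOne));
}

// Bilinear blend of four uint8 samples using integer arithmetic only.
// wx / wy are the fixed-point weights of the right / bottom neighbours.
inline uint8_t BlendFixed(uint8_t tl, uint8_t tr, uint8_t bl, uint8_t br,
                          int32_t wx, int32_t wy) {
  const int32_t top = tl * (kOne - wx) + tr * wx;       // scaled by 2^10
  const int32_t bottom = bl * (kOne - wx) + br * wx;    // scaled by 2^10
  const int32_t sum = top * (kOne - wy) + bottom * wy;  // scaled by 2^20
  const int32_t rounding = 1 << (2 * kFracBits - 1);
  return static_cast<uint8_t>((sum + rounding) >> (2 * kFracBits));
}

// Per-output-column data that does not depend on the row or the channel.
struct ColumnParam {
  int32_t x0_offset;  // element offset of the left neighbour within a row
  int32_t x1_offset;  // element offset of the right neighbour within a row
  int32_t wx;         // fixed-point weight of the right neighbour
};

// Resize one HWC image (a single batch entry of an NHWC tensor).
std::vector<uint8_t> ResizeBilinearNHWC(const std::vector<uint8_t>& src,
                                        int32_t in_h, int32_t in_w, int32_t channels,
                                        int32_t out_h, int32_t out_w) {
  assert(static_cast<int64_t>(src.size()) == int64_t{in_h} * in_w * channels);

  // Loop-invariant code motion: column offsets and weights are computed once.
  std::vector<ColumnParam> cols(out_w);
  for (int32_t x = 0; x < out_w; ++x) {
    const float in_x = static_cast<float>(x) * in_w / out_w;
    const int32_t x0 = std::min(static_cast<int32_t>(in_x), in_w - 1);
    const int32_t x1 = std::min(x0 + 1, in_w - 1);
    cols[x] = {x0 * channels, x1 * channels, ToFixed(in_x - x0)};
  }

  std::vector<uint8_t> dst(static_cast<std::size_t>(out_h) * out_w * channels);
  uint8_t* out = dst.data();
  for (int32_t y = 0; y < out_h; ++y) {
    // Row pointers and the vertical weight are invariant for the whole row.
    const float in_y = static_cast<float>(y) * in_h / out_h;
    const int32_t y0 = std::min(static_cast<int32_t>(in_y), in_h - 1);
    const int32_t y1 = std::min(y0 + 1, in_h - 1);
    const int32_t wy = ToFixed(in_y - y0);
    const uint8_t* row0 = src.data() + y0 * in_w * channels;
    const uint8_t* row1 = src.data() + y1 * in_w * channels;
    for (const ColumnParam& c : cols) {
      // In NHWC the channels of one pixel are contiguous, so this inner loop
      // walks memory linearly and reuses the same weights for every channel.
      for (int32_t ch = 0; ch < channels; ++ch) {
        *out++ = BlendFixed(row0[c.x0_offset + ch], row0[c.x1_offset + ch],
                            row1[c.x0_offset + ch], row1[c.x1_offset + ch],
                            c.wx, wy);
      }
    }
  }
  return dst;
}
```

For example, under these assumptions `ResizeBilinearNHWC({0, 100}, 1, 2, 1, 1, 4)` widens a 1x2 single-channel image to 1x4. The innermost loop contains no floating-point operations and no coordinate-mode branches, which is the kind of saving the commit message describes.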