onnxruntime
02a0be35 - Optimize Transpose around QLinearSoftmax (#22849)

Commit

1 year ago

Optimize Transpose around QLinearSoftmax (#22849)

### Description

- Improved Transpose handling around QLinearSoftmax in the Level 3 NHWC Transformer.
- Removed the redundant code HandleQLinearConcat and HandleQLinearBinaryOp.

### Motivation and Context

By merging and eliminating redundant Transpose nodes, the Image Segmentation i8 model (MobileNetv2 + DeepLabv3) achieves a 2.34X speedup.
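The idea of merging and eliminating Transpose nodes around a softmax-like operator can be illustrated with a small, library-free sketch. This is not the onnxruntime implementation; the helper names (`compose`, `is_identity`, `softmax_axis_before_transpose`) and the permutation constants are hypothetical and exist only for this example. It assumes the standard ONNX Transpose semantics, where output axis `i` takes input axis `perm[i]`.

```python
# Illustrative sketch only; names below are hypothetical, not onnxruntime APIs.

NCHW_TO_NHWC = [0, 2, 3, 1]  # perm that turns an NCHW tensor into NHWC
NHWC_TO_NCHW = [0, 3, 1, 2]  # perm that turns an NHWC tensor into NCHW


def compose(first, second):
    """Single perm equivalent to Transpose(perm=first) then Transpose(perm=second)."""
    # Output axis j of the second Transpose reads axis second[j] of the
    # intermediate tensor, which in turn reads axis first[second[j]] of the input.
    return [first[axis] for axis in second]


def is_identity(perm):
    """True when a Transpose with this perm leaves the layout unchanged."""
    return perm == list(range(len(perm)))


def softmax_axis_before_transpose(perm, axis):
    """Axis to use if a softmax is moved in front of Transpose(perm).

    Softmax(Transpose(x, perm), axis=a) yields the same values as
    Transpose(Softmax(x, axis=perm[a]), perm).
    """
    return perm[axis % len(perm)]  # the modulo normalizes negative axes


# A back-to-back NCHW->NHWC and NHWC->NCHW pair merges into a no-op,
# so both nodes can be dropped.
merged = compose(NCHW_TO_NHWC, NHWC_TO_NCHW)
print(merged, is_identity(merged))

# A channel softmax (axis=1 in NCHW) fed by an NHWC->NCHW Transpose can run
# directly on the NHWC data along its last axis instead.
print(softmax_axis_before_transpose(NHWC_TO_NCHW, 1))
```

Once the softmax axis is remapped this way, the Transpose that used to precede the operator sits next to the one that follows it, and the pair can be merged or removed as in the first check.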

References

#22849 - Optimize Transpose around QLinearSoftmax

Author

yihonglyu

Parents

135d8b2b
