onnxruntime
02a0be35 - Optimize Transpose around QLinearSoftmax (#22849)

Commit
1 year ago
Optimize Transpose around QLinearSoftmax (#22849)

### Description
- Improved Transpose handling around QLinearSoftmax in the Level 3 NHWC Transformer.
- Removed the redundant handlers HandleQLinearConcat and HandleQLinearBinaryOp.

### Motivation and Context
By merging and eliminating redundant Transpose nodes, the Image Segmentation i8 model (MobileNetv2 + DeepLabv3) achieves a 2.34X speedup.
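
Eliminating a Transpose around a softmax relies on a general identity: a softmax wrapped in a pair of inverse transposes gives the same result as a softmax applied directly along the permuted axis. The sketch below illustrates that identity on a small float matrix in plain Python. It is not the transformer code from this commit and it ignores quantization; the helper names (`softmax_rows`, `softmax_cols`, `all_close`) are hypothetical and exist only for this illustration.

```python
import math


def softmax_1d(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(matrix):
    """Swap the two axes of a list-of-lists matrix."""
    return [list(row) for row in zip(*matrix)]


def softmax_rows(matrix):
    """Softmax along the last axis (one softmax per row)."""
    return [softmax_1d(row) for row in matrix]


def softmax_cols(matrix):
    """Softmax along axis 0, computed in place of a transposed copy."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for c in range(n_cols):
        column = softmax_1d([matrix[r][c] for r in range(n_rows)])
        for r in range(n_rows):
            out[r][c] = column[r]
    return out


def all_close(a, b, tol=1e-12):
    """Element-wise comparison of two equally shaped matrices."""
    return all(
        abs(x - y) <= tol
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


x = [[1.0, 2.0, 3.0], [0.5, -1.0, 4.0]]

# Unoptimized pattern: Transpose -> Softmax(last axis) -> Transpose.
with_transposes = transpose(softmax_rows(transpose(x)))

# Folded pattern: a single softmax along the other axis, no transposes.
folded = softmax_cols(x)

print(all_close(with_transposes, folded))
```

In a layout transformer the same reasoning applies to 4-D tensors: when the surrounding transposes are inverses of each other, only the softmax axis needs to be remapped, and both Transpose nodes can be dropped.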