onnxruntime
01c5d057 - Avoid repeated GemmSoftmaxGemmPermuteTunableOp<HipT> ctor invocation (#16518)

Commit
2 years ago
`GemmSoftmaxGemmPermuteTunableOp<HipT>` is expensive to construct. Avoiding the repeated ctor invocation substantially reduces launch time and improves performance during decoding. This gives a <7% end-to-end time reduction for Whisper large.
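
The general idea of not re-running an expensive constructor on every launch can be sketched as follows. This is only an illustration, not the code from this commit: `ExpensiveOp`, `g_ctor_calls`, `LaunchRebuildingEachTime`, and `LaunchReusingInstance` are hypothetical names, and the use of a `static thread_local` instance is an assumption about one reasonable way to reuse the object.

```cpp
#include <vector>

// Hypothetical counter, used only to make constructions observable.
int g_ctor_calls = 0;

// Hypothetical stand-in for an op whose constructor does a lot of setup work
// (for example, registering many candidate kernels).
struct ExpensiveOp {
  std::vector<int> candidates;

  ExpensiveOp() : candidates(1024, 1) { ++g_ctor_calls; }

  int operator()(int x) const {
    return x + static_cast<int>(candidates.size());
  }
};

// Before: a fresh op is constructed on every launch.
int LaunchRebuildingEachTime(int x) {
  ExpensiveOp op;
  return op(x);
}

// After: the op is constructed once per thread on first use and then reused.
int LaunchReusingInstance(int x) {
  static thread_local ExpensiveOp op;
  return op(x);
}
```

With this shape, calling `LaunchReusingInstance` repeatedly from one thread pays the construction cost only on the first call, which matters most when the same op is launched many times, as in a decoding loop.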