onnxruntime
Optimize FastGelu with float2 and float4 vectorized kernels on ROCm
#11491
Merged
