onnxruntime
SkipSimplifiedLayerNorm + QuickGelu bfloat16 CUDA implementation
#24772
Merged
