onnxruntime
94e34ace - Bugfix for SimplifiedLayerNormalization (#12975)

Commit
3 years ago
Bugfix for SimplifiedLayerNormalization (#12975)

This PR fixes https://github.com/microsoft/onnxruntime/issues/12930 and https://github.com/microsoft/onnxruntime/issues/12579.

In detail:
- For the CPU EP, the current implementation of SimplifiedLayerNormalization doesn't support input and scale having different data types, so a sub-graph that contains a Cast op is not fused. This guarantees that the inputs and the output all have the same data type.
- For the CUDA EP, add (fp16, float) support to the (T, V) type constraints, so that all combinations of fp16 and float are supported by the implementation.

With the fix, the original model can be run with SimplifiedLayerNormalization, which also helps to improve performance.
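
The rule described in this commit message can be pictured with a small toy model. This is only a sketch and not onnxruntime code: the function `can_fuse`, the provider strings, and the type names are hypothetical, chosen for illustration. It covers just the two cases the message mentions (CPU EP and CUDA EP, with fp16 and float), and assumes that T is the input type and V is the scale type.

```python
# Toy model of the fusion rule described in the commit message.
# Every name here is hypothetical; this is NOT the onnxruntime implementation.

# Types the commit message says the CUDA EP accepts for both T and V.
CUDA_TYPES = {"float16", "float"}


def can_fuse(provider: str, input_type: str, scale_type: str, has_cast: bool) -> bool:
    """Return True if the sub-graph may be fused into SimplifiedLayerNormalization.

    provider   -- "cpu" or "cuda" (assumed labels for the two EPs)
    input_type -- element type of the input, assumed to be T
    scale_type -- element type of the scale, assumed to be V
    has_cast   -- whether the matched sub-graph contains a Cast op
    """
    if provider == "cpu":
        # The CPU kernel needs input and scale to share one data type,
        # so a sub-graph containing a Cast op is left unfused.
        return (not has_cast) and input_type == scale_type
    if provider == "cuda":
        # Any combination of fp16 and float is accepted for (T, V).
        # Other types are outside what the commit message describes.
        return input_type in CUDA_TYPES and scale_type in CUDA_TYPES
    raise ValueError(f"provider not covered by this sketch: {provider}")


if __name__ == "__main__":
    # Mixed fp16 input with a float scale, reached through a Cast op.
    print(can_fuse("cpu", "float16", "float", has_cast=True))
    print(can_fuse("cuda", "float16", "float", has_cast=True))
```

Under this model, a mixed-precision sub-graph is skipped on the CPU EP but fused on the CUDA EP, which matches the behavior the message describes for the original model.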