onnxruntime
94e34ace - Bugfix for SimplifiedLayerNormalization (#12975)

Commit

3 years ago

Bugfix for SimplifiedLayerNormalization (#12975)

This PR fixes https://github.com/microsoft/onnxruntime/issues/12930 and https://github.com/microsoft/onnxruntime/issues/12579. In detail:

- For the CPU EP, the current implementation of SimplifiedLayerNormalization doesn't support input and scale having different data types, so if the sub-graph contains a Cast op, the sub-graph is not fused. This guarantees that both inputs and the output have the same data type.
- For the CUDA EP, add (fp16, float) support to the (T, V) type constraints, so that all combinations of fp16 and float are supported by the implementation.

With the fix, the original model can run with SimplifiedLayerNormalization, which also helps improve performance.
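To make the fusion rule in the message concrete, here is a minimal Python sketch. It is not the onnxruntime code: `simplified_layer_norm` is a plain reference for the RMS-style normalization that SimplifiedLayerNormalization computes (no mean subtraction, no bias), and `should_fuse` is a hypothetical helper that only restates the per-EP rule as this commit message describes it. The function names, the `epsilon` default, and the string EP labels are all assumptions made for illustration.

```python
import math


def simplified_layer_norm(x, scale, epsilon=1e-5):
    """Reference RMS-style normalization over one row of values.

    y_i = x_i / sqrt(mean(x^2) + epsilon) * scale_i
    """
    if len(x) != len(scale):
        raise ValueError("x and scale must have the same length")
    mean_square = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_square + epsilon)
    return [v * inv_rms * s for v, s in zip(x, scale)]


def should_fuse(execution_provider, subgraph_has_cast):
    """Hypothetical restatement of the rule in the commit message.

    CPU: skip fusion when the sub-graph contains a Cast, because the CPU
    kernel needs input and scale to share one data type.
    CUDA: fuse either way, because the kernel accepts every combination
    of fp16 and float for (T, V).
    """
    if execution_provider == "CPU":
        return not subgraph_has_cast
    if execution_provider == "CUDA":
        return True
    # The commit message only covers the CPU and CUDA EPs.
    raise ValueError("rule not described for EP: " + execution_provider)


if __name__ == "__main__":
    print(simplified_layer_norm([3.0, 4.0], [1.0, 1.0], epsilon=0.0))
    print(should_fuse("CPU", subgraph_has_cast=True))
    print(should_fuse("CUDA", subgraph_has_cast=True))
```

With a unit scale and `epsilon=0.0`, the mean of the squared outputs is 1, which is a quick sanity check that the normalization divides by the root mean square rather than by a standard deviation.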

References

#12975 - Bugfix for SimplifiedLayerNormalization

Author

Lafi7e

Parents

237ccc01
