onnxruntime
bd11ab68 - Optimize LayernormGrad (#4156)

5 years ago
* Draft for LayerNorm optimization.
* Modify the LayernormGrad kernel based on the new backward graph.
* Keep two LayernormGrad implementations. One is based on the input X and mean. The other is based on the output Y, scale, and bias. The first is enabled by default. The second can be enabled with --use_invertible_layernorm_grad.
* Expose use_invertible_layernorm_grad to the frontend.
* Add fp16 tests.

Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: Weixing Zhang <wezhan@microsoft.com>
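
The difference between the two gradient variants can be illustrated with a minimal pure-Python sketch for a single row. This is not the kernel from this commit: the function names are hypothetical, and it assumes the inverse standard deviation is saved from the forward pass in both variants. The first variant rebuilds the normalized activations from X and mean; the "invertible" variant recovers them from Y as (Y - bias) / scale, which only works when every scale element is nonzero.

```python
import math


def layer_norm_forward(x, scale, bias, eps=1e-5):
    """Forward pass over one row; returns Y plus the saved statistics."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    y = [(v - mean) * inv_std * s + b for v, s, b in zip(x, scale, bias)]
    return y, mean, inv_std


def _grad_from_x_hat(dy, x_hat, scale, inv_std):
    """Shared backward math once the normalized activations are known."""
    n = len(dy)
    dx_hat = [g * s for g, s in zip(dy, scale)]
    mean_a = sum(dx_hat) / n
    mean_b = sum(g * h for g, h in zip(dx_hat, x_hat)) / n
    dx = [inv_std * (g - mean_a - h * mean_b) for g, h in zip(dx_hat, x_hat)]
    dscale = [g * h for g, h in zip(dy, x_hat)]  # per-row contribution
    dbias = list(dy)                             # per-row contribution
    return dx, dscale, dbias


def grad_from_input(dy, x, scale, mean, inv_std):
    """Variant 1 (hypothetical name): uses the input X and the saved mean."""
    x_hat = [(v - mean) * inv_std for v in x]
    return _grad_from_x_hat(dy, x_hat, scale, inv_std)


def grad_from_output(dy, y, scale, bias, inv_std):
    """Variant 2 (hypothetical name): inverts Y = x_hat * scale + bias.

    Assumes every element of scale is nonzero.
    """
    x_hat = [(v - b) / s for v, s, b in zip(y, scale, bias)]
    return _grad_from_x_hat(dy, x_hat, scale, inv_std)
```

Under these assumptions the second variant does not need X during the backward pass, since everything is derived from Y, scale, and bias. Both functions should return the same gradients up to floating-point error.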