DeepSpeed
8e891aa5 - Transformer kernel/fix layer norm (#1587)

Commit
3 years ago
Transformer kernel/fix layer norm (#1587)

* fixing the softmax masking when using triangular masking
* fix a bug in the layernorm backward kernels
* revert back some changes & remove debug code
* change the constants to a macro

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>