01de5dda - add mixed data type support for LayerNorm backward on CPU (#88064)

### Motivation
Amp provides convenience methods for mixed precision. When users run bfloat16 models with amp, torch.autocast keeps module parameters in the accumulation dtype, which leaves gamma and beta in float while the input/output are in bfloat16. The same holds for backward: the parameters are in float, while X, dX and dY are in bfloat16. Mixed data type support in LayerNorm backward is therefore needed to train models that use LayerNorm with amp.

### Testing
Single socket (icx, 32 cores):

| shape | fp32 forward (ms) | bf16 forward (ms) | mix forward (ms) | fp32 backward (ms) | bf16 backward (ms) | mix backward (ms) |
| -- | -- | -- | -- | -- | -- | -- |
| (1, 8, 16) | 0.012 | 0.012 | 0.012 | 0.071 | 0.065 | 0.062 |
| (8, 8, 16) | 0.015 | 0.014 | 0.015 | 0.074 | 0.070 | 0.063 |
| (32, 8, 16) | 0.062 | 0.016 | 0.016 | 0.073 | 0.073 | 0.072 |
| (64, 128, 56, 56) | 2.467 | 0.907 | 0.0897 | 12.993 | 7.603 | 7.777 |
| (64, 128, 256, 256) | 48.904 | 25.589 | 25.472 | 343.992 | 183.133 | 188.222 |

Single core (icx):

| shape | fp32 forward (ms) | bf16 forward (ms) | mix forward (ms) | fp32 backward (ms) | bf16 backward (ms) | mix backward (ms) |
| -- | -- | -- | -- | -- | -- | -- |
| (1, 8, 16) | 0.012 | 0.012 | 0.012 | 0.050 | 0.050 | 0.050 |
| (8, 8, 16) | 0.014 | 0.014 | 0.014 | 0.052 | 0.054 | 0.053 |
| (32, 8, 16) | 0.034 | 0.019 | 0.018 | 0.059 | 0.067 | 0.066 |
| (64, 128, 56, 56) | 66.791 | 17.725 | 19.799 | 119.431 | 106.123 | 107.446 |
| (64, 128, 256, 256) | 1542.477 | 402.132 | 527.044 | 3019.437 | 2336.318 | 2448.320 |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88064
Approved by: https://github.com/jgong5, https://github.com/malfet
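
The following is a minimal sketch (not part of the commit) of the situation this kernel covers: a bfloat16 input normalized with float32 gamma/beta, as CPU amp produces, followed by a backward pass. The tensor shapes and the printed dtypes are illustrative assumptions, and it presumes a PyTorch build that includes this change and its forward-pass counterpart.

```python
import torch
import torch.nn.functional as F

# bfloat16 activations with float32 affine parameters -- the mixed
# data type combination that CPU autocast produces for LayerNorm.
x = torch.randn(32, 8, 16, dtype=torch.bfloat16, requires_grad=True)
weight = torch.randn(16, dtype=torch.float32, requires_grad=True)
bias = torch.randn(16, dtype=torch.float32, requires_grad=True)

# Mixed-dtype LayerNorm forward on CPU.
y = F.layer_norm(x, (16,), weight, bias)

# Backward through the mixed-dtype kernel: dX is expected to follow the
# input dtype (bfloat16), while dgamma/dbeta follow the parameter dtype.
y.sum().backward()

print(x.grad.dtype)       # expected: torch.bfloat16
print(weight.grad.dtype)  # expected: torch.float32
```

The same path is exercised implicitly when a model containing nn.LayerNorm runs under `torch.autocast(device_type="cpu", dtype=torch.bfloat16)`, since autocast keeps the module parameters in float while the surrounding activations are bfloat16.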