Fix a bug in the backpropagation of LayerNorm when create_graph=True (#41595)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/41332.
I found that the bug reported in that issue is caused by LayerNorm: the current implementations of its backward pass diverge between
1. [`create_graph=False` CUDA implementation](https://github.com/BIT-silence/pytorch/blob/dde3d5f4a8f713ecc4649d776565b68ca75ae5c8/aten/src/ATen/native/cuda/layer_norm_kernel.cu#L145)
2. [`create_graph=True` implementation](https://github.com/BIT-silence/pytorch/blob/dde3d5f4a8f713ecc4649d776565b68ca75ae5c8/tools/autograd/templates/Functions.cpp#L2536)
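The consistency this fix restores can be checked with a minimal sketch (my own illustration, not part of the PR): compute the input gradient once through the regular backward path and once through the differentiable backward path (`create_graph=True`), and compare the results.

```python
import torch

torch.manual_seed(0)
x = torch.randn(4, 8, dtype=torch.double, requires_grad=True)
ln = torch.nn.LayerNorm(8).double()
y = ln(x)
grad_out = torch.randn_like(y)

# Gradient via the regular (non-differentiable) backward path.
(g_plain,) = torch.autograd.grad(y, x, grad_out, retain_graph=True)
# Gradient via the differentiable backward path used for double backward.
(g_graph,) = torch.autograd.grad(y, x, grad_out, create_graph=True)

# After the fix, both paths should agree.
print(torch.allclose(g_plain, g_graph))
```

`torch.autograd.gradgradcheck` can be used for a stricter numerical check of the double-backward path.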
With this bug fix, https://github.com/pytorch/pytorch/issues/41332 is resolved.
Ailing BIT-silence
Signed-off-by: Vinnam Kim <vinnamkim@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41595
Reviewed By: houseroad
Differential Revision: D22598415
Pulled By: BIT-silence
fbshipit-source-id: 63e390724bd935dc8e028b4dfb75d34a80558c3a