Fix the "invalid configuration argument" error when running layer norm backward (#80893)
Summary: Fix the corner case with N = 0, where the layer norm backward CUDA kernel launch fails with an "invalid configuration argument" error.
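
A minimal standalone sketch of the failing corner case (M = 1, N = 0), assuming the error originates in torch's native layer_norm CUDA backward; the fbgemm_gpu test below reaches it through its swish layer norm reference path, so the exact call sequence differs:
```
import torch

# Mirrors the falsifying example: M=1, N=0, float32, CUDA, epsilon=0.1.
M, N, eps = 1, 0, 0.1
X = torch.randn(M, N, device="cuda", requires_grad=True)
weight = torch.randn(N, device="cuda", requires_grad=True)
bias = torch.randn(N, device="cuda", requires_grad=True)

# Forward over an empty normalized dimension succeeds.
Y = torch.nn.functional.layer_norm(X, (N,), weight, bias, eps)
grad_output = torch.randn_like(Y)

# Before this Diff, the backward pass on this N = 0 case failed with
# "CUDA error: invalid configuration argument".
Y.backward(grad_output)
```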
Test Plan:
buck run mode/opt //deeplearning/fbgemm/fbgemm_gpu/fb:layer_norm_test 2>&1 | tee out.log
Before this Diff:
```
test_swish_layer_norm (fbgemm_gpu.test.layer_norm_test.SparseOpsTest) ... INFO:2022-07-05 09:00:32 738347:738347 CuptiActivityProfiler.cpp:166] CUDA versions. CUPTI: 14; Runtime: 11040; Driver: 11040
Falsifying example: test_swish_layer_norm(
self=<fbgemm_gpu.test.layer_norm_test.SparseOpsTest testMethod=test_swish_layer_norm>,
M=1,
N=0,
dtype=torch.float32,
device='cuda',
epsilon=0.1,
)
ERROR
======================================================================
ERROR: test_swish_layer_norm (fbgemm_gpu.test.layer_norm_test.SparseOpsTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/data/users/jianyuhuang/fbsource/fbcode/buck-out/opt/gen/aab7ed39/deeplearning/fbgemm/fbgemm_gpu/fb/layer_norm_test#binary,link-tree/fbgemm_gpu/test/layer_norm_test.py", line 41, in test_swish_layer_norm
M=st.integers(0, 32),
File "/data/users/jianyuhuang/fbsource/fbcode/buck-out/opt/gen/aab7ed39/deeplearning/fbgemm/fbgemm_gpu/fb/layer_norm_test#binary,link-tree/hypothesis/core.py", line 1164, in wrapped_test
raise the_error_hypothesis_found
File "/data/users/jianyuhuang/fbsource/fbcode/buck-out/opt/gen/aab7ed39/deeplearning/fbgemm/fbgemm_gpu/fb/layer_norm_test#binary,link-tree/fbgemm_gpu/test/layer_norm_test.py", line 88, in test_swish_layer_norm
Y_ref.backward(grad_output, retain_graph=True)
File "/data/users/jianyuhuang/fbsource/fbcode/buck-out/opt/gen/aab7ed39/deeplearning/fbgemm/fbgemm_gpu/fb/layer_norm_test#binary,link-tree/torch/_tensor.py", line 401, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/data/users/jianyuhuang/fbsource/fbcode/buck-out/opt/gen/aab7ed39/deeplearning/fbgemm/fbgemm_gpu/fb/layer_norm_test#binary,link-tree/torch/autograd/__init__.py", line 191, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA error: invalid configuration argument
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
----------------------------------------------------------------------
Ran 1 test in 3.578s
FAILED (errors=1)
```
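
After this Diff, the same N = 0 case is expected to run through backward cleanly and produce zero-element gradients. A hedged sketch of the expected behavior (a hypothetical check, not part of the actual test plan):
```
import torch

X = torch.randn(1, 0, device="cuda", requires_grad=True)
weight = torch.randn(0, device="cuda", requires_grad=True)
bias = torch.randn(0, device="cuda", requires_grad=True)

Y = torch.nn.functional.layer_norm(X, (0,), weight, bias, 0.1)
Y.backward(torch.randn_like(Y))

# Gradients should be empty tensors with the input shapes rather than
# triggering a failed kernel launch.
assert X.grad.shape == (1, 0)
assert weight.grad.shape == (0,)
assert bias.grad.shape == (0,)
```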
Differential Revision: D37618022
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80893
Approved by: https://github.com/ngimel