pytorch
166d4e42 - Change `test_conv_large` parameter initialization (#71521)

Summary:
This PR adjusts the parameters of the conv layer in `test_conv_large` to better avoid NaN values. Previously, this test could compute a NaN for `scale` (propagated from `.mean()` on the `.grad` tensor). That NaN was then propagated to the scaled gradients via division, making the subsequent `assertEqual` check vacuous, since `assertEqual` treats `NaN == NaN` as equal by default. (This behavior was observed on V100 and A100.)

To make failures caused by NaNs in `grad1` visible, `scale` is now computed from `grad2` instead. Interestingly, we discovered this issue while trying out some less common setups that broke this test; those breakages turned out to be cases where there were no NaN values, so an actual `assertEqual` comparison ran and failed for `float16`.

CC ptrblck ngimel puririshi98

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71521
Reviewed By: anjali411
Differential Revision: D33776705
Pulled By: ngimel
fbshipit-source-id: a1ec4792cba04c6322b22ef5b80ce08579ea4cf6
(cherry picked from commit d207bd9b87f8e8c2cb13182b7295c17e19dc3dba)
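As an illustration of the failure mode described above, here is a minimal sketch (with hypothetical tensor values, not the actual `test_conv_large` code). It uses the standalone `torch.testing.assert_close`, whose `equal_nan` option defaults to `True`, in place of the test-suite `assertEqual` method:

```python
import torch

grad1 = torch.tensor([1.0, float("nan"), 3.0])  # gradient under test, contains a NaN
grad2 = torch.tensor([1.0, 2.0, 3.0])           # reference gradient, all finite

# Old approach: scale derived from the tensor under test, so the NaN propagates.
bad_scale = grad1.mean()  # NaN
# Both sides become all-NaN after division, and NaN == NaN is treated as equal
# by default (equal_nan=True), so the check passes vacuously.
torch.testing.assert_close(grad1 / bad_scale, grad2 / bad_scale)

# Fixed approach: scale derived from the reference tensor, so it stays finite
# and the NaN in grad1 now produces a visible assertion failure.
good_scale = grad2.mean()  # 2.0
try:
    torch.testing.assert_close(grad1 / good_scale, grad2 / good_scale)
except AssertionError as e:
    print("NaN exposed:", e)
```

Deriving the scale from the known-good reference tensor is what keeps the check meaningful: a NaN confined to the tensor under test can no longer cancel out on both sides of the comparison.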
Author: eqy