Increase some tolerances for tf32 for Conv3d tests (#60451)
Summary:
Allow these tests to pass on A100 GPUs, which support tf32.
This is a follow-up to https://github.com/pytorch/pytorch/pull/52871, which also raised some tolerances to 0.05.
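For context on why tf32 needs looser tolerances, here is a minimal sketch that emulates tf32 input rounding in pure Python. It assumes tf32 keeps 10 of float32's 23 mantissa bits and rounds to nearest; the helper names `round_to_tf32` and `tf32_dot` are hypothetical, and accumulating in Python floats is a simplification of what the hardware does.

```python
import struct


def round_to_tf32(x: float) -> float:
    """Round x to float32, then keep only the 10 most significant mantissa bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # float32 stores 23 mantissa bits and tf32 keeps 10, so the low 13 are dropped.
    # Adding half of the dropped range first rounds the magnitude to nearest
    # (ties away from zero); the hardware's exact rounding mode is an assumption.
    bits = (bits + (1 << 12)) & ~((1 << 13) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def tf32_dot(a, b):
    """Dot product with both operands rounded to tf32 before multiplying."""
    return sum(round_to_tf32(x) * round_to_tf32(y) for x, y in zip(a, b))


x = 33.45570601919647
rounded = round_to_tf32(x)
print(rounded)           # 33.46875
print(abs(rounded - x))  # rounding error of a single value of this magnitude
```

In the range [32, 64) the emulated tf32 values are spaced 2**-5 = 0.03125 apart, so a single rounding at this magnitude can already be off by more than 0.005.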
For reference, these are the failures I see (the only ones in test_nn with 1.9.0):
```
FAIL: test_Conv3d_pad_same_cuda_tf32 (__main__.TestNN)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1033, in wrapper
method(*args, **kwargs)
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1033, in wrapper
method(*args, **kwargs)
File "test_nn.py", line 11296, in with_tf32_on
test.test_cuda(self, **kwargs)
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_nn.py", line 5103, in test_cuda
test_case.assertEqualIgnoreType(cpu_d_i, gpu_d_i, atol=self.precision, rtol=0)
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1254, in assertEqualIgnoreType
return self.assertEqual(*args, exact_dtype=False, **kwargs)
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1355, in assertEqual
super().assertTrue(result, msg=self._get_assert_msg(msg, debug_msg=debug_msg))
AssertionError: False is not true : Tensors failed to compare as equal!With rtol=0 and atol=0.005, found 161 element(s) (out of 288) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 0.032408137116391345 (-33.45570601919647 vs. -33.42329788208008), which occurred at index (2, 0, 0, 1, 0).
======================================================================
FAIL: test_Conv3d_pad_same_dilated_cuda_tf32 (__main__.TestNN)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1033, in wrapper
method(*args, **kwargs)
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1033, in wrapper
method(*args, **kwargs)
File "test_nn.py", line 11296, in with_tf32_on
test.test_cuda(self, **kwargs)
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_nn.py", line 5103, in test_cuda
test_case.assertEqualIgnoreType(cpu_d_i, gpu_d_i, atol=self.precision, rtol=0)
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1254, in assertEqualIgnoreType
return self.assertEqual(*args, exact_dtype=False, **kwargs)
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1355, in assertEqual
super().assertTrue(result, msg=self._get_assert_msg(msg, debug_msg=debug_msg))
AssertionError: False is not true : Tensors failed to compare as equal!With rtol=0 and atol=0.005, found 111 element(s) (out of 288) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 0.024654212557543076 (35.104286017977465 vs. 35.07963180541992), which occurred at index (3, 0, 0, 0, 2).
======================================================================
FAIL: test_Conv3d_pad_valid_cuda_tf32 (__main__.TestNN)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1033, in wrapper
method(*args, **kwargs)
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1033, in wrapper
method(*args, **kwargs)
File "test_nn.py", line 11296, in with_tf32_on
test.test_cuda(self, **kwargs)
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_nn.py", line 5103, in test_cuda
test_case.assertEqualIgnoreType(cpu_d_i, gpu_d_i, atol=self.precision, rtol=0)
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1254, in assertEqualIgnoreType
return self.assertEqual(*args, exact_dtype=False, **kwargs)
File "/tmp/easybuild-tmp/eb-ED4M3d/tmpqOhUjN/lib/python3.8/site-packages/torch/testing/_internal/common_utils.py", line 1355, in assertEqual
super().assertTrue(result, msg=self._get_assert_msg(msg, debug_msg=debug_msg))
super().assertTrue(result, msg=self._get_assert_msg(msg, debug_msg=debug_msg))
AssertionError: False is not true : Tensors failed to compare as equal!With rtol=0 and atol=0.005, found 41 element(s) (out of 288) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 0.010903167642320355 (8.074376869119371 vs. 8.06347370147705), which occurred at index (0, 0, 1, 0, 0).
```
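The three reported differences can be checked against the old and new tolerances with a small sketch. The `within_tolerance` helper is hypothetical and assumes the usual `|cpu - gpu| <= atol + rtol * |cpu|` criterion; the value pairs are copied from the log above, in the order printed (CPU first, GPU second, following the `assertEqualIgnoreType(cpu_d_i, gpu_d_i, ...)` call).

```python
def within_tolerance(cpu: float, gpu: float, atol: float, rtol: float = 0.0) -> bool:
    """Hypothetical stand-in for the comparison done by the test harness."""
    return abs(cpu - gpu) <= atol + rtol * abs(cpu)


# Greatest differences reported in the log above: (CPU value, GPU tf32 value).
failures = {
    "test_Conv3d_pad_same_cuda_tf32": (-33.45570601919647, -33.42329788208008),
    "test_Conv3d_pad_same_dilated_cuda_tf32": (35.104286017977465, 35.07963180541992),
    "test_Conv3d_pad_valid_cuda_tf32": (8.074376869119371, 8.06347370147705),
}

for name, (cpu, gpu) in failures.items():
    old_ok = within_tolerance(cpu, gpu, atol=0.005)  # tolerance in the failing runs
    new_ok = within_tolerance(cpu, gpu, atol=0.05)   # tolerance used in PR 52871
    print(f"{name}: diff={abs(cpu - gpu):.4f} atol=0.005 -> {old_ok}, atol=0.05 -> {new_ok}")
```

All three worst-case differences exceed 0.005 but stay below 0.05, which is consistent with reusing the 0.05 tolerance here.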
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60451
Reviewed By: albanD
Differential Revision: D29353255
Pulled By: ngimel
fbshipit-source-id: 155a02242be5a11dcbd9dd40ab63f15c6757ae1b