Enable softmax and tiny norm FP16 tests on ROCm (#46363)
Summary:
This pull request enables the following tests on ROCm:
* TestCuda.test_tiny_half_norm_
* TestNNDeviceTypeCUDA.test_softmax_cuda_float16
* TestNNDeviceTypeCUDA.test_softmax_cuda_float32
* TestNNDeviceTypeCUDA.test_softmax_results_cuda_float16
* TestNNDeviceTypeCUDA.test_softmax_results_cuda_float32
These tests were previously skipped because of a precision issue in FP16 compute on MI25 hardware with ROCm 3.7 and older. The issue was fixed in the compiler in ROCm 3.8.
This pull request fixes https://github.com/pytorch/pytorch/issues/37493
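
For context, a minimal standard-library sketch of why a norm over tiny FP16 values is sensitive to intermediate precision. It does not reproduce the ROCm compiler issue and does not use PyTorch. The helper names (`to_half`, `norm_fp16_accumulate`, `norm_wide_accumulate`) and the input values are assumptions chosen for illustration only.

```python
import math
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical inputs: 25 tiny values, each rounded to FP16 first.
values = [to_half(i / 100000000) for i in range(25)]


def norm_fp16_accumulate(xs):
    """L2 norm where every intermediate result is rounded to FP16."""
    total = 0.0
    for x in xs:
        # Each square is far below the smallest FP16 subnormal, so it rounds to 0.
        total = to_half(total + to_half(x * x))
    return to_half(math.sqrt(total))


def norm_wide_accumulate(xs):
    """L2 norm accumulated in wider precision, rounded to FP16 only at the end."""
    total = 0.0
    for x in xs:
        total += x * x
    return to_half(math.sqrt(total))


if __name__ == "__main__":
    print("FP16 intermediates:", norm_fp16_accumulate(values))
    print("wide intermediates:", norm_wide_accumulate(values))
```

With these inputs, rounding every intermediate to FP16 collapses the norm to zero, while accumulating in wider precision yields a small positive result that is still representable in FP16.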
cc: jeffdaily ezyang malfet mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46363
Reviewed By: heitorschueroff
Differential Revision: D24325639
Pulled By: ezyang
fbshipit-source-id: a7dbb238cf38c04b6592baad40b4d71725a358c9