Revert tl.int1 casting change for ROCm to avoid hangs (#110531)
We are seeing hangs on ROCm, seemingly since https://github.com/pytorch/pytorch/pull/110388 landed. Example log:
https://ossci-raw-job-status.s3.amazonaws.com/log/17381916785
`inductor/test_torchinductor_opinfo.py::TestInductorOpInfoCUDA::test_comprehensive_exp2_cuda_bool Command took >30min, returning 124`
This PR makes the tl.int1 casting change conditional so that it is skipped on ROCm while we investigate.
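
For illustration only, a minimal sketch of how such a ROCm guard could look. The helper name `should_apply_int1_cast` is hypothetical and is not the actual Inductor code. The sketch assumes the check is based on `torch.version.hip`, which is `None` on non-ROCm builds; the value is passed in as a parameter so the example runs without torch installed.

```python
from typing import Optional


def should_apply_int1_cast(hip_version: Optional[str]) -> bool:
    """Decide whether the tl.int1 casting change should be applied.

    Hypothetical helper for illustration. `hip_version` stands in for
    `torch.version.hip`, which is None on non-ROCm builds and a version
    string on ROCm builds.
    """
    # Skip the tl.int1 cast on ROCm, where the hang was observed.
    return hip_version is None


# Non-ROCm build: the cast is applied.
print(should_apply_int1_cast(None))
# ROCm build (the version string here is an arbitrary example): the cast is skipped.
print(should_apply_int1_cast("5.6.0"))
```

In real code the argument would be `torch.version.hip` read at codegen time.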
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110531
Approved by: https://github.com/peterbell10