CUDA BFloat16 and other improvements on abs (#44804)
Summary:
Not sure whether ROCm supports `std::abs` today; let's see what the CI reports.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44804
Reviewed By: mruberry
Differential Revision: D23748837
Pulled By: ngimel
fbshipit-source-id: ccf4e63279f3e5927a85d8d8f70ba4b8c334156b