CUDA BFloat16 support of clamp, remainder, lshift, rshift (#45247)
Summary:
Add CUDA BFloat16 support for the clamp, remainder, lshift, and rshift ops.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45247
Reviewed By: dzhulgakov
Differential Revision: D24174258
Pulled By: ngimel
fbshipit-source-id: bfcd2d1b3746bb0527d590533f3c38b9c4d0a638