Enable `BFloat16` for `nan_to_num` on CUDA (#58063)
Summary:
Enabled `BFloat16` for `nan_to_num` on CUDA. For the comparison with NumPy in the OpInfo test, a [workaround suggested](https://github.com/pytorch/pytorch/issues/57982#issuecomment-839150556) by ngimel is used: because NumPy has no `bfloat16` dtype, the default replacement values for infinities would differ between the PyTorch result and the NumPy reference, so the OpInfo's `sample.kwargs` is used to pass explicit `posinf` and `neginf` kwargs to NumPy for `BFloat16`.
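A minimal sketch of the idea (not the test code itself; assumes a CUDA device is available):

```python
import torch

# BFloat16 tensor on CUDA containing NaN and +/-inf.
x = torch.tensor([float('nan'), float('inf'), -float('inf'), 1.0],
                 dtype=torch.bfloat16, device='cuda')

# By default, nan_to_num maps NaN -> 0 and +/-inf to the dtype's
# largest/smallest finite values; for bfloat16 these differ from what a
# NumPy reference (which lacks bfloat16) would substitute.
y = torch.nan_to_num(x)

# Passing explicit posinf/neginf (here, bfloat16's finite range) makes the
# PyTorch and NumPy substitutions agree; the OpInfo workaround forwards the
# same kwargs to numpy.nan_to_num via sample.kwargs.
posinf = torch.finfo(torch.bfloat16).max
neginf = torch.finfo(torch.bfloat16).min
z = torch.nan_to_num(x, nan=0.0, posinf=posinf, neginf=neginf)
print(z)  # tensor([0.0000e+00, 3.3895e+38, -3.3895e+38, 1.0000e+00], ...)
```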
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58063
Reviewed By: mruberry
Differential Revision: D28373478
Pulled By: ngimel
fbshipit-source-id: 6493b560d83632a8519c1d3bfc5c54be9b935fb9