Update cuda_to_hip_mappings.py (#122110)
Added one datatype mapping (cuda_bf16.h) and a number of cub/hipcub mappings. Note: the missing mappings were discovered when hipifying the forward kernel of the Mamba model (https://github.com/state-spaces/mamba).
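For readers unfamiliar with hipify, the following is a minimal sketch of how a name-mapping table of this kind can be applied to source text. It is not the code from this change: the table `EXAMPLE_MAPPINGS`, the function `hipify_source`, the HIP header path, and the specific cub names are all hypothetical and chosen for illustration only. The real table in cuda_to_hip_mappings.py stores additional metadata per entry.

```python
import re

# Hypothetical, simplified table: CUDA name -> assumed HIP name.
# These entries are illustrative and are not copied from the PR.
EXAMPLE_MAPPINGS = {
    "cuda_bf16.h": "hip/hip_bf16.h",        # assumed header target
    "cub::BlockLoad": "hipcub::BlockLoad",  # assumed cub/hipcub pair
    "cub::BlockScan": "hipcub::BlockScan",  # assumed cub/hipcub pair
}


def hipify_source(source: str, mappings=EXAMPLE_MAPPINGS) -> str:
    """Replace every mapped CUDA name in `source` with its HIP name."""
    # Try longer keys first so a longer name is never shadowed by a shorter one.
    keys = sorted(mappings, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in keys)
    # The lookbehind skips names already prefixed by an identifier character,
    # so text that is already "hipcub::..." is left unchanged.
    pattern = re.compile(r"(?<![A-Za-z0-9_])(?:" + alternation + ")")
    return pattern.sub(lambda m: mappings[m.group(0)], source)


cuda_src = "#include <cuda_bf16.h>\nusing Load = cub::BlockLoad<float, 128, 4>;"
hip_src = hipify_source(cuda_src)
```

A name that has no entry in the table passes through untouched, which is how a missing mapping tends to surface: the hipified kernel still references a CUDA-only header or namespace and fails to compile.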
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122110
Approved by: https://github.com/jithunnair-amd, https://github.com/Skylion007