[ROCm] HIP headers fix (#3532)
* Add cg headers hipification
* Skip including cuda_bf16.h on ROCm
* Merge
* Restrict inclusion of cuda_bf16.h with the BF16_AVAILABLE var
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>