Kernel Data Conversion Utility (#2327)
* Unify macro definitions and constants in a single file
* Implement conversion utility
* Fix reversion from formatting
* Bugfixes after testing with correct DeepSpeed
* Make inline markers available on both HIP and CUDA