[CUDA][HIP] make trivial ctor/dtor host device (#72394)
Make trivial ctor/dtor implicitly host device functions so that they can
be used to initialize file-scope
device variables to match nvcc behavior.
Fixes: https://github.com/llvm/llvm-project/issues/72261
Fixes: SWDEV-432412