[Clang][HIP][CUDA] Add `__cluster_dims__` and `__no_cluster__` attribute (#156686)
This PR adds basic frontend support for the `__cluster_dims__` and
`__no_cluster__` attributes.
In CUDA/HIP programming, the `__cluster_dims__` attribute can be
applied to a kernel function to set the dimensions of a thread block
cluster. The `__no_cluster__` attribute can be applied to a kernel
function to indicate that the thread block cluster feature will not be
enabled for it, either at compile time or at kernel launch time. Note
that `__no_cluster__` is an LLVM/Clang-only attribute.
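A minimal usage sketch of the two attributes follows. The kernel names and
bodies are hypothetical, and the placement of `__no_cluster__` is assumed
to mirror that of `__cluster_dims__`. Building it presumably requires a
target that supports thread block clusters.

```cuda
// Hypothetical kernel: requests a compile-time cluster shape of
// 2 x 1 x 1 thread blocks. The attribute sits between the return type
// and the kernel name, as in the CUDA programming guide's cluster example.
__global__ void __cluster_dims__(2, 1, 1)
scale_clustered(float *data, float factor, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n)
    data[i] *= factor;
}

// Hypothetical kernel: declares that it does not use thread block
// clusters. Placement is an assumption, chosen to match the kernel above.
__global__ void __no_cluster__
scale_unclustered(float *data, float factor, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n)
    data[i] *= factor;
}
```

When a kernel has a compile-time cluster shape, CUDA expects the grid
dimensions at launch to be a multiple of the cluster dimensions, so
`scale_clustered` would be launched with an even number of blocks in x.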
Co-authored-by: Yaxun (Sam) Liu <yaxun.liu@amd.com>
Co-authored-by: Jay Foad <jay.foad@amd.com>