[CUDA][HIP] Fix CTAD for host/device constructors (#168711)
Clang currently does not allow using CTAD in CUDA/HIP device functions
since deduction guides are treated as host-only. This patch fixes that
by treating deduction guides as host+device. The rationale is that
deduction guides do not actually generate code in IR, and there is an
existing check for device/host correctness for constructors.
The patch also suppresses duplicate implicit deduction guides from
host/device constructors with identical signatures and constraints
to prevent ambiguity.
For CUDA/HIP, deduction guides are now always implicitly enabled for
both host and device, which matches nvcc's effective behavior. Unlike
nvcc, which silently ignores explicit CUDA/HIP target attributes on
deduction guides, Clang diagnoses such attributes as errors to keep
the syntax clean and avoid confusion.
This ensures CTAD works correctly in CUDA/HIP for constructors with
different target attributes and provides clearer diagnostics when users
attempt to annotate deduction guides with CUDA/HIP target attributes.
Example:
```
#include <tuple>
__host__ __device__ void func()
{
std::tuple<int, int> t = std::tuple(1, 1);
}
```
This compiles with nvcc but fails with clang for CUDA/HIP without this
fix.
Reference: https://godbolt.org/z/WhT1GrhWE
Fixes: https://github.com/ROCm/ROCm/issues/5646
Fixes: https://github.com/llvm/llvm-project/issues/146646