[HLSL][DirectX] Implement HLSL `mul` function and DXIL lowering of `llvm.matrix.multiply` (#184882)
Fixes #99138
- Defines a `__builtin_hlsl_mul` clang builtin in `Builtins.td`.
- Exposes the `__builtin_hlsl_mul` clang builtin in
`hlsl_alias_intrinsics.h` under the name `mul` for matrix cases
- Implements scalar and vector elementwise multiplication cases of the
`mul` function in `hlsl_intrinsics.h` and `hlsl_intrinsic_helpers.h`
- Adds sema for `__builtin_hlsl_mul` to `CheckBuiltinFunctionCall` in
`SemaHLSL.cpp`
- Adds codegen for `__builtin_hlsl_mul` to `EmitHLSLBuiltinExpr` in
`CGHLSLBuiltins.cpp`
- Vector-vector cases lower to `dot` (except double vectors, which
expand to scalar multiply-adds).
- Matrix-matrix, matrix-vector, and vector-matrix multiplication lower
to the `llvm.matrix.multiply` intrinsic
- Adds codegen tests to `clang/test/CodeGenHLSL/builtins/mul.hlsl`
- Adds sema tests to `clang/test/SemaHLSL/BuiltIns/mul-errors.hlsl`
- Implements lowering of the `llvm.matrix.multiply` intrinsic to DXIL in
`DXILIntrinsicExpansion.cpp`
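
For reference, the arithmetic that a scalarized `llvm.matrix.multiply` has to reproduce can be sketched in plain C++. This is only an illustration, not the code in `DXILIntrinsicExpansion.cpp`: the helper name `matrixMultiplyColMajor` is hypothetical, and the actual expansion may emit different instructions. The sketch assumes both operands are flattened in column-major order and takes the dimensions in the same order as the intrinsic (outer rows, inner, outer columns).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not part of this patch).
// Multiplies an R x K matrix A by a K x C matrix B. Both inputs and the
// result are flattened in column-major order, so element (row, col) of a
// matrix with NumRows rows lives at index col * NumRows + row.
std::vector<float> matrixMultiplyColMajor(const std::vector<float> &A,
                                          const std::vector<float> &B,
                                          std::size_t R, std::size_t K,
                                          std::size_t C) {
  std::vector<float> Result(R * C, 0.0f);
  for (std::size_t Col = 0; Col < C; ++Col) {
    for (std::size_t Row = 0; Row < R; ++Row) {
      float Sum = 0.0f;
      // Dot product of row `Row` of A with column `Col` of B.
      for (std::size_t Inner = 0; Inner < K; ++Inner)
        Sum += A[Inner * R + Row] * B[Col * K + Inner];
      Result[Col * R + Row] = Sum;
    }
  }
  return Result;
}
```

With `A = [[1, 2], [3, 4]]` and `B = [[5, 6], [7, 8]]`, the flattened inputs are `{1, 3, 2, 4}` and `{5, 7, 6, 8}`, and the flattened product is `{19, 43, 22, 50}`.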
Note: Currently the SPIR-V backend does not support row-major matrix
memory layouts when lowering matrix multiply, and just assumes
column-major layout. Therefore this PR also makes the DirectX backend
assume only column-major layout. Support for row-major order will be
implemented in a separate PR. (Tracked by
https://github.com/llvm/llvm-project/issues/184906)
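
To make the layout distinction concrete, here is a minimal sketch of how the flat index of element (row, col) differs between the two layouts. The helper names are hypothetical and exist only for illustration; neither backend is claimed to contain these functions.

```cpp
#include <cstddef>

// Hypothetical helpers (not part of this patch).
// Column-major: consecutive elements of a column are adjacent in memory.
constexpr std::size_t colMajorIndex(std::size_t Row, std::size_t Col,
                                    std::size_t NumRows) {
  return Col * NumRows + Row;
}

// Row-major: consecutive elements of a row are adjacent in memory.
constexpr std::size_t rowMajorIndex(std::size_t Row, std::size_t Col,
                                    std::size_t NumCols) {
  return Row * NumCols + Col;
}
```

For a 2x3 matrix, element (0, 1) sits at flat index 2 in column-major order but at index 1 in row-major order, so a lowering that assumes the wrong layout reads the wrong elements.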
This PR locally passes the `mul` offload tests in both DirectX 12 and
Vulkan: https://github.com/llvm/offload-test-suite/pull/941
Assisted-by: claude-opus-4.6