llvm-project
363903eb - [CIR][AArch64] Add lowering for unpredicated svdup builtins (#174433)

Commit
99 days ago
This PR adds CIR lowering support for unpredicated `svdup` SVE builtins. The
corresponding ACLE intrinsics are documented at:
* https://developer.arm.com/architectures/instruction-sets/intrinsics
  (search for svdup).

Since LLVM provides a direct intrinsic for svdup with a 1:1 mapping, CIR
lowers these builtins by emitting a call to the corresponding LLVM
intrinsic.

DESIGN NOTES
------------
With this change, ACLE intrinsics that have a corresponding LLVM intrinsic
can generally be lowered by CIR by reusing LLVM intrinsic metadata, avoiding
duplicated intrinsic-name definitions, unless codegen-relevant
`SVETypeFlag`s are involved. As a consequence, CIR may no longer emit NYI
diagnostics for intrinsics that (a) have a known LLVM intrinsic mapping and
(b) do not use such codegen-relevant `SVETypeFlag`s; these intrinsics are
lowered directly.

IMPLEMENTATION NOTES
--------------------
* Intrinsic discovery logic mirrors the approach in
  CodeGen/TargetBuiltins/ARM.cpp, but is simplified since CIR only requires
  the intrinsic name.
* Test inputs are copied from the existing svdup tests:
  tests/CodeGen/AArch64/sve-intrinsics/acle_sve_dup.c.
* The LLVM IR produced _with_ and _without_ `-fclangir` is identical, modulo
  basic block labels, SROA, and function attributes.
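
The rule stated in the design notes (lower directly when an LLVM intrinsic mapping is known and no codegen-relevant flags are set, otherwise fall back to NYI) can be sketched roughly as follows. This is only an illustration: `BuiltinInfo`, `kCodegenRelevantFlags`, `Lowering`, and `classify` are hypothetical names and do not correspond to the actual CIR or Clang data structures, and the flag mask value is arbitrary.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for one builtin's metadata; not a real Clang type.
struct BuiltinInfo {
  std::string llvmIntrinsicName; // empty when no LLVM intrinsic mapping is known
  std::uint64_t typeFlags;       // bit set loosely modelled on SVETypeFlags
};

// Hypothetical mask of flags that would require custom codegen (value is arbitrary).
constexpr std::uint64_t kCodegenRelevantFlags = 0xF0;

enum class Lowering { DirectIntrinsicCall, NotYetImplemented };

// Condition (a): a known LLVM intrinsic mapping exists.
// Condition (b): none of the codegen-relevant flags are set.
Lowering classify(const BuiltinInfo &info) {
  const bool hasMapping = !info.llvmIntrinsicName.empty();
  const bool needsCustomCodegen = (info.typeFlags & kCodegenRelevantFlags) != 0;
  if (hasMapping && !needsCustomCodegen)
    return Lowering::DirectIntrinsicCall;
  return Lowering::NotYetImplemented;
}
```

Under these assumptions, an entry with a non-empty intrinsic name and no flagged bits is classified as a direct intrinsic call; anything else stays on the NYI path.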
EXAMPLE LOWERING
----------------
Input:
```C
svint8_t test_svdup_n_s8(int8_t op)
{
  return svdup_n_s8(op);
}
```

OUTPUT 1 (default):
```llvm
define dso_local <vscale x 16 x i8> @test_svdup_n_s8(i8 noundef %op) #0 {
entry:
  %op.addr = alloca i8, align 1
  store i8 %op, ptr %op.addr, align 1
  %0 = load i8, ptr %op.addr, align 1
  %1 = call <vscale x 16 x i8> @llvm.aarch64.sve.dup.x.nxv16i8(i8 %0)
  ret <vscale x 16 x i8> %1
}
```

OUTPUT 2 (via `-fclangir`):
```llvm
define dso_local <vscale x 16 x i8> @test_svdup_n_s8(i8 %0) #0 {
  %2 = alloca i8, i64 1, align 1
  %3 = alloca <vscale x 16 x i8>, i64 1, align 16
  store i8 %0, ptr %2, align 1
  %4 = load i8, ptr %2, align 1
  %5 = call <vscale x 16 x i8> @llvm.aarch64.sve.dup.x.nxv16i8(i8 %4)
  store <vscale x 16 x i8> %5, ptr %3, align 16
  %6 = load <vscale x 16 x i8>, ptr %3, align 16
  ret <vscale x 16 x i8> %6
}
```
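
Both outputs call `@llvm.aarch64.sve.dup.x.nxv16i8`, whose suffix mirrors the scalable return type `<vscale x 16 x i8>`. As a rough sketch of how that suffix relates to the element type, assuming the usual 128-bit SVE granule and covering only plain integer and floating-point elements (the helper names `lanesPerGranule` and `sveDupIntrinsicName` are hypothetical and are not part of LLVM or CIR):

```cpp
#include <string>

// Hypothetical helper: lanes per 128-bit SVE granule for a given element width.
constexpr unsigned lanesPerGranule(unsigned elementBits) {
  return 128 / elementBits;
}

// Hypothetical helper: builds the overloaded name, e.g. 8-bit integer -> "...nxv16i8".
std::string sveDupIntrinsicName(unsigned elementBits, bool isFloat) {
  const std::string suffix = "nxv" + std::to_string(lanesPerGranule(elementBits)) +
                             (isFloat ? "f" : "i") + std::to_string(elementBits);
  return "llvm.aarch64.sve.dup.x." + suffix;
}
```

For the `svint8_t` example above, `sveDupIntrinsicName(8, false)` produces the same name that appears in the two IR listings. The commit itself reuses LLVM intrinsic metadata rather than building names by hand; this sketch only illustrates the naming pattern.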