[CIR][X86] Add support for shuff32x4/shufi32x4 builtins (#172960)
This implementation is adapted from the existing handling of
`X86::BI__builtin_ia32_shuf_i*` and `X86::BI__builtin_ia32_shuf_f*` in
`/llvm-project/clang/lib/CodeGen/TargetBuiltins/X86.cpp`.
It adds support for the following X86 builtins:
- __builtin_ia32_shuf_f32x4
- __builtin_ia32_shuf_f64x2
- __builtin_ia32_shuf_i32x4
- __builtin_ia32_shuf_i64x2
- __builtin_ia32_shuf_f32x4_256
- __builtin_ia32_shuf_f64x2_256
- __builtin_ia32_shuf_i32x4_256
- __builtin_ia32_shuf_i64x2_256
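
For reviewers unfamiliar with these builtins, the following is a minimal standalone sketch of how the immediate operand is typically turned into shuffle indices over the concatenation of the two source vectors. It assumes the usual 128-bit lane-selection scheme for these instructions (the low half of the result selects lanes from the first source, the high half from the second). The function name `shufLaneIndices` and its parameters are hypothetical and are not taken from this patch or from the CIR code.

```cpp
#include <vector>

// Hypothetical helper, for illustration only (not part of this patch).
// imm:        the 8-bit immediate passed to the builtin
// numElts:    number of elements in one source vector (e.g. 16 for f32x4)
// vectorBits: total vector width in bits (512 or 256)
// Returns indices into the concatenation [A, B] of the two sources,
// where 0..numElts-1 refer to A and numElts..2*numElts-1 refer to B.
std::vector<int> shufLaneIndices(unsigned imm, unsigned numElts,
                                 unsigned vectorBits) {
  // A 512-bit vector has four 128-bit lanes, a 256-bit vector has two.
  const unsigned numLanes = vectorBits == 512 ? 4 : 2;
  const unsigned numLaneElts = numElts / numLanes;

  std::vector<int> indices(numElts);
  for (unsigned l = 0; l != numElts; l += numLaneElts) {
    // The low bits of imm pick which source lane feeds this result lane.
    unsigned index = (imm % numLanes) * numLaneElts;
    imm /= numLanes; // Discard the bits just consumed.
    // The upper half of the result reads from the second source.
    if (l >= numElts / 2)
      index += numElts;
    for (unsigned i = 0; i != numLaneElts; ++i)
      indices[l + i] = static_cast<int>(index + i);
  }
  return indices;
}
```

Under these assumptions, `shufLaneIndices(0xE4, 16, 512)` selects lanes 0 and 1 of the first source followed by lanes 2 and 3 of the second, and `shufLaneIndices(1, 4, 256)` selects the upper lane of the first source followed by the lower lane of the second.
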
Part of https://github.com/llvm/llvm-project/issues/167765