llvm-project
5a3fdecb - [X86] Add i256/i512 CTPOP expansion on AVX512VPOPCNTDQ targets (#182830)

Commit
61 days ago
If the i256/i512 value can be freely folded into the FPU (vector) domain, we can use VPOPCNTQ to perform a per-element CTPOP and then perform an expanded VECREDUCE_ADD: VPMOVQB truncates the v4i64/v8i64 counts to v16i8 with zeroed upper elements, then VPSADBW sums the lower 8 byte elements.

Fixes #182829
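The expansion described above can be illustrated with a scalar model. This is only a sketch of the arithmetic for the i512 case, not the code added by the commit: `popcountLane` and `ctpop512Model` are hypothetical names, and each loop stands in for one vector instruction (VPOPCNTQ, VPMOVQB, VPSADBW against a zero vector).

```cpp
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: models VPOPCNTQ on a single 64-bit lane.
inline std::uint64_t popcountLane(std::uint64_t Lane) {
  return std::bitset<64>(Lane).count();
}

// Hypothetical scalar model of the i512 CTPOP expansion (8 x i64 lanes).
inline unsigned ctpop512Model(const std::array<std::uint64_t, 8> &Lanes) {
  // Steps 1 and 2: per-lane popcount (VPOPCNTQ), then truncation of each
  // count to a byte (VPMOVQB). Each count is at most 64, so it fits in a
  // byte. The upper 8 bytes of the v16i8 result stay zero.
  std::array<std::uint8_t, 16> Bytes{};
  for (std::size_t I = 0; I < Lanes.size(); ++I)
    Bytes[I] = static_cast<std::uint8_t>(popcountLane(Lanes[I]));

  // Step 3: VPSADBW against zero sums the absolute differences of the low
  // 8 bytes, which here is just the sum of the byte counts (at most 512).
  unsigned Sum = 0;
  for (std::size_t I = 0; I < 8; ++I)
    Sum += Bytes[I];
  return Sum;
}
```

The i256 case follows the same shape with four lanes, leaving twelve zero bytes after the truncation instead of eight.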