llvm-project
7ec6f177 - [AMDGPU] Cost of i8 vector insert/extract is free in some cases (#194991)

Commit
1 day ago
[AMDGPU] Cost of i8 vector insert/extract is free in some cases (#194991)

Reduce the cost of i8 vector insert and extract elements to avoid scalarization in VectorCombine.

During VectorCombine it is impossible to know whether an extract element will require additional instructions or be free. Making that assessment requires considerable additional context, for example which instructions use the extract elements and which other extract element indices occur. This patch picks some cases that likely do not require instructions, which reduces the overall cost and avoids scalarization.

Because of this change, some SLP vectorization opportunities are missed. In general, those missed cases require scalarization during code generation anyway, so the compiler ends up generating the same code with and without SLP vectorization.
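
To illustrate why a lower extract cost can prevent scalarization, here is a minimal toy sketch of a cost comparison between a vector form (one wide load plus per-lane extracts) and a scalar form (one narrow load per lane). Everything in it is hypothetical: the `ToyCosts` struct, the function names, and the numbers are assumptions made for illustration and are not taken from the patch, from VectorCombine, or from the AMDGPU cost model. The commit message does not say which cases the patch treats as free.

```cpp
// Hypothetical toy cost model; none of these names or numbers come from LLVM.
struct ToyCosts {
  int VectorLoad; // cost of one wide load of the whole i8 vector
  int ScalarLoad; // cost of one narrow load of a single i8 element
  int ExtractI8;  // cost charged per i8 extractelement
};

// Cost of keeping the vector load and extracting each used lane.
constexpr int vectorFormCost(ToyCosts C, int NumExtracts) {
  return C.VectorLoad + NumExtracts * C.ExtractI8;
}

// Cost of replacing the vector load with one scalar load per used lane.
constexpr int scalarFormCost(ToyCosts C, int NumExtracts) {
  return NumExtracts * C.ScalarLoad;
}

// In this toy model, scalarize only when the scalar form is strictly cheaper.
constexpr bool shouldScalarize(ToyCosts C, int NumExtracts) {
  return scalarFormCost(C, NumExtracts) < vectorFormCost(C, NumExtracts);
}
```

With four used lanes and an extract cost of 1, `shouldScalarize(ToyCosts{1, 1, 1}, 4)` compares 4 against 5 and picks the scalar form. With the extract treated as free, `shouldScalarize(ToyCosts{1, 1, 0}, 4)` compares 4 against 1 and keeps the vector form.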