llvm-project
6c6fb00c - [AMDGPU] Optimize S_OR_B32 to S_ADDK_I32 where possible (#177949)

Commit
76 days ago
This PR fixes #177753 by converting a disjoint S_OR_B32 to S_ADDK_I32 whenever possible. The transformation is skipped when the S_OR_B32 can instead be converted to a bitset instruction.

Note on Test Failures (Draft Status)

This change causes significant register reshuffling across the test suite. The causes are the new allocation hints, the operand swaps performed when src0 is not a register and src1 is, and the change from or to addk.

To avoid a massive, noisy diff during the initial logic review, this Draft PR only includes a representative sample of updated tests:

- CodeGen/AMDGPU/combine-reg-or-const.ll -> showcases the change from S_OR to S_ADDK
- CodeGen/AMDGPU/s-barrier.ll -> showcases the swap between src0 and src1 when src0 is not a register

The rest of the updated tests show the result of the register allocation hint. I have checked every test I updated, and they look correct to me.

Once the core logic is approved, I will run the update script across the remaining ~70 failing tests and mark the PR as "Ready for Review."
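For readers unfamiliar with why a disjoint OR can become an add, the following is a minimal standalone sketch of the reasoning, not the code from this PR. When the two operands share no set bits, the addition produces no carries, so `a | b == a + b`. The sketch also assumes that S_ADDK_I32 takes a 16-bit signed immediate, and that an OR with a single-bit constant is the case left for the bitset conversion. All names here (`Lowering`, `isDisjoint`, `fitsSImm16`, `isSingleBit`, `chooseLowering`) are hypothetical and do not correspond to LLVM APIs. The real pass would establish disjointness from instruction flags or known-bits analysis rather than from concrete values.

```cpp
#include <cstdint>

// Hypothetical illustration only; these names are not part of LLVM.
enum class Lowering { KeepOr, BitSet, AddK };

// True when no bit is set in both values. In that case adding the values
// produces no carries, so (a | b) == (a + b).
constexpr bool isDisjoint(uint32_t a, uint32_t b) { return (a & b) == 0; }

// Assumption: the immediate must be representable as a sign-extended
// 16-bit value, i.e. lie in [-32768, 32767] when viewed as signed 32-bit.
constexpr bool fitsSImm16(uint32_t imm) {
  return imm <= 0x7FFFu || imm >= 0xFFFF8000u;
}

// True when exactly one bit is set.
constexpr bool isSingleBit(uint32_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Simplified decision: prefer the bitset form for single-bit constants,
// otherwise use the add-immediate form when the OR is known to be disjoint
// and the constant fits, otherwise leave the OR alone.
constexpr Lowering chooseLowering(bool knownDisjoint, uint32_t imm) {
  if (isSingleBit(imm))
    return Lowering::BitSet;
  if (knownDisjoint && fitsSImm16(imm))
    return Lowering::AddK;
  return Lowering::KeepOr;
}

// Compile-time check of the identity the rewrite relies on.
static_assert(isDisjoint(0xFF00u, 0x00FFu) &&
                  (0xFF00u | 0x00FFu) == (0xFF00u + 0x00FFu),
              "disjoint OR equals ADD");
```

The sketch simplifies the bitset condition. The actual backend may apply further checks (for example, whether the constant is already cheap to encode) before choosing that form.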