llvm-project
c657a6f6 - [AMDGPU] Fix selection of s_load_b96 on GFX11 (#108029)

Commit
1 year ago
Fix a bug which resulted in selection of s_load_b96 on GFX11, an instruction which only exists in GFX12.

The root cause was a mismatch between legalization and selection. The condition used to check that the load was uniform in legalization (SITargetLowering::LowerLOAD) was "!Op->isDivergent()". The condition used to detect a non-uniform load during selection (AMDGPUDAGToDAGISel::isUniformLoad()) was "N->isDivergent() && !AMDGPUInstrInfo::isUniformMMO(MMO)".

This makes a difference when IR uniformity analysis has more information than SDAG's built-in analysis. In the test case this is because IR UA reports that everything is uniform if isSingleLaneExecution() returns true, e.g. if the specified max flat workgroup size is 1, but SDAG does not have this optimization.

The immediate fix is to use the same condition to detect uniform loads in legalization and selection. In future SDAG should learn about isSingleLaneExecution(), and then it could probably stop relying on IR metadata to detect uniform loads.
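
The mismatch between the two conditions quoted in the message can be illustrated with a small standalone model. This is only a sketch, not LLVM code: `LoadInfo`, `legalizationSeesUniform`, `selectionSeesUniform`, `predicatesDisagree`, and `countDisagreements` are hypothetical names, and the two booleans merely stand in for `isDivergent()` and `AMDGPUInstrInfo::isUniformMMO(MMO)`.

```cpp
#include <cassert>

// Hypothetical stand-in for the two facts the commit message refers to.
// These are NOT LLVM types; they only model the boolean inputs.
struct LoadInfo {
  bool divergent;  // models N->isDivergent() (SDAG's built-in analysis)
  bool uniformMMO; // models AMDGPUInstrInfo::isUniformMMO(MMO) (IR-derived)
};

// Legalization check before the fix, as quoted: "!Op->isDivergent()".
inline bool legalizationSeesUniform(LoadInfo L) { return !L.divergent; }

// Selection treats a load as non-uniform when
// "N->isDivergent() && !isUniformMMO(MMO)", so uniform is the negation.
inline bool selectionSeesUniform(LoadInfo L) {
  return !(L.divergent && !L.uniformMMO);
}

// True when the two stages classify the same load differently.
inline bool predicatesDisagree(LoadInfo L) {
  return legalizationSeesUniform(L) != selectionSeesUniform(L);
}

// Walk all four input combinations and count the disagreements.
inline int countDisagreements() {
  int Count = 0;
  for (int D = 0; D < 2; ++D)
    for (int U = 0; U < 2; ++U)
      if (predicatesDisagree(LoadInfo{D != 0, U != 0}))
        ++Count;
  return Count;
}
```

In this model the predicates disagree for exactly one input: SDAG considers the load divergent while the IR-derived memory operand information says it is uniform. That corresponds to the situation the message describes, where IR uniformity analysis knows more than SDAG. Using a single shared predicate in both stages, as the fix does, removes the disagreement by construction.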