AMDGPU: Try constant fold after folding immediate (#141862)
This helps avoid some regressions in a future patch. The or 0
pattern appears in the division tests because the reduce 64-bit
bit operation to a 32-bit one with half identity value is only
implemented for constants. We could fix that by using computeKnownBits.
Additionally the pattern disappears if I optimize the IR division
expansion, so that IR should probably be emitted more optimally in
the first place.