[AMDGPU] Add optimization for llvm.amdgcn.wave.shuffle in uniform cases (#174795)
When the llvm.amdgcn.wave.shuffle intrinsic is called with a uniform
Index operand, every lane reads from the same source lane, so the call
is effectively equivalent to the llvm.amdgcn.readlane intrinsic. This
change detects that case and replaces the shuffle with readlane.
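
The equivalence can be illustrated with a small host-side model. This is
only a sketch of the intended semantics, not the code added by this
change: the 32-lane wave size, the names Wave, waveShuffle, readLane and
uniformShuffleMatchesReadLane, and the wrap-around of out-of-range lane
indices are all assumptions made for the illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model: a wave of 32 lanes, one 32-bit value per lane.
constexpr std::size_t kWaveSize = 32;
using Wave = std::array<std::uint32_t, kWaveSize>;

// Models a wave shuffle: lane i receives the value held by lane index[i].
// Wrapping the index with % is a modeling convenience, not a statement
// about how the hardware treats out-of-range indices.
Wave waveShuffle(const Wave &value, const Wave &index) {
  Wave result{};
  for (std::size_t lane = 0; lane < kWaveSize; ++lane)
    result[lane] = value[index[lane] % kWaveSize];
  return result;
}

// Models readlane: a single scalar read from one lane of the wave.
std::uint32_t readLane(const Wave &value, std::uint32_t lane) {
  return value[lane % kWaveSize];
}

// With a uniform index (the same lane number in every lane), each lane
// of the shuffle result holds the value that readLane returns.
bool uniformShuffleMatchesReadLane(const Wave &value, std::uint32_t lane) {
  Wave index{};
  index.fill(lane);
  const Wave shuffled = waveShuffle(value, index);
  const std::uint32_t scalar = readLane(value, lane);
  for (std::uint32_t v : shuffled)
    if (v != scalar)
      return false;
  return true;
}
```

In this model, uniformShuffleMatchesReadLane holds for any input wave
and lane, which is the property the replacement relies on.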
---------
Signed-off-by: Domenic Nutile <domenic.nutile@gmail.com>