[AMDGPU] SIPeepholeSDWA: Handle V_CNDMASK_B32_e64 (#137930)
The VOP3 form of the V_CNDMASK_B32 instruction takes its carry-in as
an explicit operand. Converting to SDWA implies converting to the VOP2
form, which reads the carry-in implicitly from VCC instead.
Before SDWA conversion, rewrite V_CNDMASK_B32_e64 instructions that are
SDWA candidates into V_CNDMASK_B32_e32 and insert a copy of the
carry-in operand to VCC.
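
The rewrite described above can be illustrated with a small standalone
model. This is only a sketch and not the code in SIPeepholeSDWA: ToyInst
and lowerCndMaskToE32 are hypothetical names, operands are plain strings,
and concerns a real pass would presumably need to handle (source
modifiers, whether VCC is otherwise live, wave32 vs. wave64) are ignored.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not LLVM's
// MachineInstr): an opcode name, a destination, and explicit sources.
struct ToyInst {
  std::string Opcode;
  std::string Dst;
  std::vector<std::string> Srcs;
};

// Rewrite a toy V_CNDMASK_B32_e64 (src0, src1, carry-in) into a COPY of
// the carry-in to VCC followed by the e32 form, which keeps only src0
// and src1 as explicit operands. Any other instruction is returned
// unchanged.
std::vector<ToyInst> lowerCndMaskToE32(const ToyInst &MI) {
  if (MI.Opcode != "V_CNDMASK_B32_e64" || MI.Srcs.size() != 3)
    return {MI};

  const std::string &CarryIn = MI.Srcs[2];
  std::vector<ToyInst> Out;
  // The e32 form reads the carry-in implicitly from VCC, so copy it
  // there first.
  Out.push_back({"COPY", "vcc", {CarryIn}});
  Out.push_back({"V_CNDMASK_B32_e32", MI.Dst, {MI.Srcs[0], MI.Srcs[1]}});
  return Out;
}
```

In this model, one e64 instruction becomes two instructions, and the
carry-in register no longer appears as an operand of the select itself.
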
Closes #133431.
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>