[AMDGPU][SIInsertWaitcnts][NFC] Simplify logic in GFX12Plus::applyPreexistingWaitcnts (#184925)
The loop is collecting the first instruction of each waitcnt kind and is
erasing the rest, with the exception of DEPCTR which needs more checks.
The existing code was factoring out the instruction deletion and the
setting of the collected instruction variables. But the special handling
for DEPCTR and the in-loop deletion of `S_WAITCNT_lds_direct` was just
complicating the logic.