[AMDGPU][Waitcnts] Don't create a pending flat event for LDS DMA (#170263)
Flat instructions need a waitcnt(0) on both VMEM and LDS accesses, but
only when the instruction really is using flat addressing. The LDS DMA
instructions (on GFX9) have the FLAT flag set, but they have very clear
semantics. These instructions update only VM_CNT (on GFX9), and hence do
not need to be treated like actual flat instructions.