llvm-project
238a970d - [AMDGPU] DS loop wait relaxation -- more test cases and improvements to handle them (4/4)

Commit
22 days ago
Add handling for same-iteration use/overwrite of DS load results:
- Track DS load destinations and detect when results are used or overwritten within the same iteration
- Compute FloorWaitCount for WMMAs that only use flushed loads

Add bailout for tensor_load_to_lds and LDS DMA writes after barrier

Add negative test based on profitability criteria

Assisted-by: Cursor / claude-4.5-opus-high