llvm-project
b09174b4 - [AMDGPU] Enable runtime loop unrolling (#194924)

Commit
5 days ago
Enable automatic runtime unrolling for AMDGPU by setting `UP.Runtime = true` in `getUnrollingPreferences`, with `PartialThreshold = Threshold / 4` to limit code-size growth.

Benchmarked on **MI350X (gfx950)** and **MI300X (gfx942)** using Composable Kernel, xpu-perf, and llama.cpp. Results showed some improvements and no real regressions.

AI Disclaimer: Cursor was used to evaluate the change and run benchmarking experiments.
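
For readers unfamiliar with the transformation, here is a minimal, self-contained sketch of the two ideas in the message. It is not the code from this commit. `UnrollPrefsSketch`, `applyRuntimeUnrollSketch`, `sumRolled`, and `sumUnrolledBy4` are hypothetical names, and the `Threshold` value of 300 is a placeholder chosen for illustration. The first part mirrors the two settings the message describes. The second part shows by hand what unrolling a loop with a run-time trip count looks like: an unrolled main loop plus a remainder loop for the leftover iterations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the unrolling knobs named in the commit message.
// This is NOT the real LLVM type; it only models the three fields discussed.
struct UnrollPrefsSketch {
  unsigned Threshold = 300;       // placeholder value, for illustration only
  unsigned PartialThreshold = 0;  // budget for partial/runtime unrolling
  bool Runtime = false;           // allow unrolling with a run-time trip count
};

// Mirrors the described change: turn on runtime unrolling and cap the
// partial budget at a quarter of the full threshold to limit code growth.
void applyRuntimeUnrollSketch(UnrollPrefsSketch &UP) {
  UP.Runtime = true;
  UP.PartialThreshold = UP.Threshold / 4;
}

// Reference loop: the trip count is only known at run time.
int sumRolled(const std::vector<int> &v) {
  int s = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    s += v[i];
  return s;
}

// Hand-written analogue of unrolling the loop above by a factor of 4.
int sumUnrolledBy4(const std::vector<int> &v) {
  const std::size_t n = v.size();
  const std::size_t mainEnd = n - (n % 4);  // largest multiple of 4 <= n
  int s = 0;
  std::size_t i = 0;
  // Main loop: four elements per iteration, one branch per four additions.
  for (; i < mainEnd; i += 4)
    s += v[i] + v[i + 1] + v[i + 2] + v[i + 3];
  // Remainder loop: handles the 0 to 3 leftover elements.
  for (; i < n; ++i)
    s += v[i];
  return s;
}
```

The extra remainder loop is the code-size cost of the transformation, which is presumably why the message pairs the change with a reduced `PartialThreshold`.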