llvm-project
ed395c89 - [AMDGPU] Use value's DebugLoc for bitcast in performStoreCombine (#186766)

Commit
12 days ago
[AMDGPU] Use value's DebugLoc for bitcast in performStoreCombine (#186766)

## Description

When `AMDGPUTargetLowering::performStoreCombine` inserts a synthetic bitcast to convert vector types (e.g. `<1 x float>` → `i32`) for stores, the bitcast inherits the **store's** SDLoc. When `DAGCombiner::visitBITCAST` later folds `bitcast(load)` → `load`, the resulting load loses its original debug location.

## Analysis

The bitcast is **not** present in the initial SelectionDAG. It is inserted during DAGCombine by `AMDGPUTargetLowering::performStoreCombine`. This can be observed with `-debug-only=isel,dagcombine`:

```
Initial selection DAG: no bitcast, load is v1f32 directly used by store

Combining: t17: ch = store ... /tmp/beans.c:6:14
 ... into: t20: ch = store ... /tmp/beans.c:6:14

Combining: t19: i32 = bitcast [ORD=3] # D:1 t13, /tmp/beans.c:6:14
 ... into: t21: i32,ch = load ... /tmp/beans.c:6:14
```

In `performStoreCombine` (`AMDGPUISelLowering.cpp`):

```cpp
SDLoc SL(N);  // N = store node → SL has store's DebugLoc
...
SDValue CastVal = DAG.getNode(ISD::BITCAST, SL, NewVT, Val);
// bitcast gets store's DebugLoc, not load's
```

When `visitBITCAST` folds `bitcast(load)` → `load`, it uses `SDLoc(N)` (the bitcast's loc = store's loc), so the resulting load loses its original debug location.

```
Before (initial DAG):
  t13: v1f32 = load ...        line 2   ; original load
  t14: ch = store t13, ...     line 3   ; store

After performStoreCombine:
  t13: v1f32 = load ...        line 2   ; original load
  t19: i32 = bitcast t13       line 3   ; synthetic bitcast (store's loc!)
  t20: ch = store t19, ...     line 3

After visitBITCAST folds (incorrect):
  t21: i32 = load ...          line 0   ; lost debug location

After visitBITCAST folds (expected):
  t21: i32 = load ...          line 2   ; preserves load's location
```

## Fix

Target-specific fix in `AMDGPUISelLowering.cpp` `performStoreCombine`: use `DAG.getBitcast()` instead of `DAG.getNode(ISD::BITCAST, SL, ...)`.

`getBitcast()` internally uses `SDLoc(V)` (the value operand's SDLoc), so the synthetic bitcast naturally inherits the load's DebugLoc instead of the store's:

```cpp
// Before:
SDValue CastVal = DAG.getNode(ISD::BITCAST, SL, NewVT, Val);
if (OtherUses) {
  SDValue CastBack = DAG.getNode(ISD::BITCAST, SL, VT, CastVal);

// After:
SDValue CastVal = DAG.getBitcast(NewVT, Val);
if (OtherUses) {
  SDValue CastBack = DAG.getBitcast(VT, CastVal);
```

This is consistent with `performLoadCombine` where the bitcast also uses the load's `SDLoc`.
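The location propagation described in this commit message can be illustrated with a small standalone model. This is only a sketch, not LLVM code: `ToyDAG`, `Node`, `bitcastAt`, `bitcastOf`, and `foldBitcastOfLoad` are hypothetical names invented for illustration, and a plain `int` line number stands in for a `DebugLoc`. It assumes, as the message states, that the folded load takes the bitcast's location, and it does not model any later merging of locations.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for SelectionDAG concepts. None of these are LLVM types.
enum class Opcode { Load, Bitcast, Store };

struct Node {
  Opcode Op;
  int Line;     // stand-in for the node's DebugLoc (a source line number)
  int Operand;  // index of the value operand in the graph, or -1 if none
};

struct ToyDAG {
  std::vector<Node> Nodes;

  int addNode(Opcode Op, int Line, int Operand = -1) {
    Nodes.push_back({Op, Line, Operand});
    return static_cast<int>(Nodes.size()) - 1;
  }

  // Models DAG.getNode(ISD::BITCAST, SL, ...): the caller picks the location.
  int bitcastAt(int Line, int Val) {
    return addNode(Opcode::Bitcast, Line, Val);
  }

  // Models DAG.getBitcast(VT, V): the location comes from the value operand.
  int bitcastOf(int Val) {
    return addNode(Opcode::Bitcast, Nodes[Val].Line, Val);
  }

  // Models the bitcast(load) -> load fold: the replacement load is created
  // with the bitcast's location, as the commit message describes.
  int foldBitcastOfLoad(int Cast) {
    assert(Nodes[Cast].Op == Opcode::Bitcast);
    assert(Nodes[Nodes[Cast].Operand].Op == Opcode::Load);
    return addNode(Opcode::Load, Nodes[Cast].Line);
  }
};

// Old behaviour: the synthetic bitcast is created with the store's location.
int foldedLoadLineWithStoreLoc() {
  ToyDAG D;
  const int StoreLine = 3;
  int Load = D.addNode(Opcode::Load, /*Line=*/2);
  int Cast = D.bitcastAt(StoreLine, Load);
  D.addNode(Opcode::Store, StoreLine, Cast);
  return D.Nodes[D.foldBitcastOfLoad(Cast)].Line;
}

// New behaviour: the synthetic bitcast inherits the loaded value's location.
int foldedLoadLineWithValueLoc() {
  ToyDAG D;
  const int StoreLine = 3;
  int Load = D.addNode(Opcode::Load, /*Line=*/2);
  int Cast = D.bitcastOf(Load);
  D.addNode(Opcode::Store, StoreLine, Cast);
  return D.Nodes[D.foldBitcastOfLoad(Cast)].Line;
}
```

In this model, `foldedLoadLineWithStoreLoc()` yields the store's line and `foldedLoadLineWithValueLoc()` yields the load's line, mirroring the "incorrect" and "expected" outcomes in the diagram, except that the real fold may report line 0 where the model reports the store's line.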