llvm
e9c78be3 - Reland [MC] Fuse relaxation and layout into a single forward pass (#190318)

Commit
2 days ago
This relands debb2514ea7f, which was reverted by #189548 due to a spurious ARM `cbz` out-of-range error (Chromium, Android).

---

Replace the two-pass inner loop in relaxOnce (relaxFragment + layoutSection) with a single forward pass that sets each fragment's offset before processing it.

- Extract relaxAlign from layoutSection's FT_Align handling and call it from relaxFragment. FT_Align padding is computed inline with the tracked Offset, so alignment fragments always see fresh upstream offsets. This structurally eliminates the O(N) convergence pitfall where stale offsets caused each iteration to fix only one more alignment fragment.
- The new MCAssembler::Stretch field tracks the cumulative upstream size delta. In evaluateFixup, for PCRel fixups during relaxation, Stretch is added to forward-reference target values (LayoutOrder comparison). This makes displacement = target_old - source_old, identical to the old two-pass approach, preventing premature relaxation for span-dependent instructions.
- FT_Fill/FT_Org are removed from relaxFragment; `if (F.Offset != Offset)` in the fused loop detects their size changes.
- layoutSection is retained for initial layout and post-finishLayout.

This fixes the FT_BoundaryAlign linear-time convergence issue reported by #176535. In addition, backward branches near the short/near boundary may benefit from tighter encoding when a .p2align between the target and the branch absorbs upstream growth (see relax-branch-align.s).
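The fused pass described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not LLVM's code: `Kind`, `Fragment`, and `fusedPass` are hypothetical names, and the model covers only fixed-size data and alignment fragments, leaving out Stretch and span-dependent instructions. It shows the one property the message relies on: each fragment's offset is assigned before the fragment is resized, so alignment padding is always computed from a fresh offset.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy model; these types are not LLVM's MCFragment classes.
enum class Kind { Data, Align };

struct Fragment {
  Kind kind;
  uint64_t offset; // offset recorded by the previous layout
  uint64_t size;   // current size; for Align this is the padding
  uint64_t align;  // power of two, used only by Kind::Align
};

// One fused forward pass: set the fragment's offset first, then recompute
// its size from that fresh offset. Returns true if anything changed.
bool fusedPass(std::vector<Fragment> &frags) {
  bool changed = false;
  uint64_t offset = 0;
  for (Fragment &f : frags) {
    if (f.offset != offset) { // an upstream size change moved this fragment
      f.offset = offset;
      changed = true;
    }
    if (f.kind == Kind::Align) {
      assert(f.align != 0 && (f.align & (f.align - 1)) == 0);
      uint64_t pad = (f.align - offset % f.align) % f.align;
      if (pad != f.size) {
        f.size = pad;
        changed = true;
      }
    }
    offset += f.size;
  }
  return changed;
}
```

In this model, a section with several alignment fragments settles in one pass and a second pass reports no change. If an upstream data fragment then grows by one byte, the next alignment fragment shrinks its padding and everything after it keeps its offset, which is the "absorbs upstream growth" effect mentioned for .p2align.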

Key commits that updated relaxFragment/layoutSection:

- 742ecfc13e8a [MC] Relax MCFillFragment and compute fragment offsets eagerly
- 9f66ebe42715 MC: Eliminate redundant fragment relaxation
- df71243fa885 MC: Evaluate .org during fragment relaxation
- b1d58f025e83 MCAssembler: Simplify fragment relaxation
- 58d16db8b5d2 MCAssembler: Simplify relaxation of FT_Fill and FT_Org

---

Fix for the ARM regression (see thumb-ldr-stretch.s):

ARM `evaluateFixup` pre-seeds Value with `(F.Offset + fixup_offset) % 4` for Thumb's AlignDown(PC, 4) semantics. The full displacement formulas:

```
Old two-pass:
  (source_old % 4) + target_old - source_old
  = target_old - alignDown(source_old, 4)

Reverted buggy fused:
  (source_new % 4) + target_old - source_new + Stretch
  = target_old + Stretch - alignDown(source_new, 4)

Fixed fused:
  ((source_new - Stretch) % 4) + target_old - source_new + Stretch
  = (source_old % 4) + target_old - source_old
  = target_old - alignDown(source_old, 4)
```

With the fused pass, F.Offset is already updated by Stretch, so the pre-seed used the new offset while the generic Stretch compensation assumed the old offset, producing a misaligned displacement. This caused spurious tLDRpci -> t2LDRpci relaxation (+2 bytes each), which cascaded to push the initial CBZ target out of range.

The fix uses pre-Stretch source offsets for the AlignDown(PC, 4) pre-seed in evaluateFixup, leading to the same displacement as the old two-pass algorithm.
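
The three displacement formulas can be checked numerically. The sketch below transcribes them as written above; `Offsets`, `oldTwoPass`, `buggyFused`, `fixedFused`, and `alignDown4` are hypothetical names for illustration, not functions in LLVM, and source_new is assumed to equal source_old + Stretch, as the message implies.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the formulas in the commit message, not LLVM code.
struct Offsets {
  int64_t sourceOld; // fixup location from the previous layout
  int64_t targetOld; // forward target, not yet visited in this pass
  int64_t stretch;   // cumulative upstream size delta so far
};

// AlignDown(x, 4) for non-negative offsets.
int64_t alignDown4(int64_t x) {
  assert(x >= 0);
  return x - (x % 4);
}

// Old two-pass: both offsets are stale, so they are mutually consistent.
int64_t oldTwoPass(const Offsets &o) {
  return (o.sourceOld % 4) + o.targetOld - o.sourceOld;
}

// Reverted fused pass: the pre-seed uses the already-updated source offset.
int64_t buggyFused(const Offsets &o) {
  int64_t sourceNew = o.sourceOld + o.stretch;
  return (sourceNew % 4) + o.targetOld - sourceNew + o.stretch;
}

// Fixed fused pass: the pre-seed uses the pre-Stretch source offset.
int64_t fixedFused(const Offsets &o) {
  int64_t sourceNew = o.sourceOld + o.stretch;
  return ((sourceNew - o.stretch) % 4) + o.targetOld - sourceNew + o.stretch;
}
```

For example, with source_old = 8, target_old = 1028, and Stretch = 2, the old two-pass and fixed fused formulas both give 1020, while the reverted formula gives 1022. Assuming the 16-bit Thumb literal load reaches word-aligned offsets up to 1020, that 2-byte difference is enough to force a relaxation the old algorithm would not have performed.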