llvm-project
b3c4d44c - [lldb] Batch breakpoint step-over for threads stopped at the same BP (#183412)

Commit
8 days ago
[lldb] Batch breakpoint step-over for threads stopped at the same BP (#183412)

When multiple threads are stopped at the same breakpoint, LLDB currently steps each thread over the breakpoint one at a time. Each step requires disabling the breakpoint, single-stepping one thread, and re-enabling it, resulting in N disable/enable cycles and N individual vCont packets for N threads. This is a common scenario for hot breakpoints in multithreaded programs and scales poorly.

This patch batches the step-over so that all threads at the same breakpoint site are stepped together in a single vCont packet, with the breakpoint disabled once at the start and re-enabled once after the last thread finishes.

At the top of WillResume, any leftover StepOverBreakpoint plans from a previous cycle are popped with their re-enable side effect suppressed via SetReenabledBreakpointSite, giving a clean slate. SetupToStepOverBreakpointIfNeeded then creates fresh plans for all threads that still need to step over a breakpoint, and these are grouped by breakpoint address.

For groups with multiple threads, each plan is set to defer its re-enable through SetDeferReenableBreakpointSite. Instead of re-enabling the breakpoint directly when it completes, a deferred plan calls ThreadFinishedSteppingOverBreakpoint, which decrements a per-address tracking count. The breakpoint is only re-enabled when the count reaches zero.

All threads in the largest group are resumed together in a single batched vCont packet. If some threads don't complete their step in one cycle, the pop-and-recreate logic naturally re-batches the remaining threads on the next WillResume call.

For 10 threads at the same breakpoint, this reduces the operation from 10 z0/Z0 pairs and 10 vCont packets to 1 z0 + 1 Z0 and a few progressively smaller batched vCont packets.

EDIT: Tried to merge this PR twice. The first time, the test was flaky, so we had to revert. The second time, we broke 2 tests on a Windows machine: https://lab.llvm.org/buildbot/#/builders/141/builds/15798

The tests failed because the cleanup code in `WillResume` was popping **ALL** `StepOverBreakpoint` plans, including non-deferred ones from incomplete single-steps. The issue was:

1) Multiple threads hit the same breakpoint. One thread's breakpoint condition evaluates to false, so it needs to auto-continue.
2) A `StepOverBreakpoint` plan is created for that thread (non-deferred).
3) On the next `WillResume`, the cleanup pops that non-deferred plan.
4) Now the `StopOthers` scan finds no thread with a `StopOthers()` plan, so `thread_to_run` stays null.
5) The else branch runs, calling `SetupToStepOverBreakpointIfNeeded` on **ALL** threads, including the thread that legitimately hit the breakpoint with a true condition.
6) That thread gets a new `StepOverBreakpoint` plan pushed, which overwrites its breakpoint stop reason with trace when the step completes.

The error `trace (2) != breakpoint (3)` confirms this: the thread that should have reported breakpoint as its stop reason instead reports trace, because an unwanted `StepOverBreakpoint` plan was pushed on it and completed.

The newly added code fixes it by only popping plans that have `GetDeferReenableBreakpointSite() == true`.

Co-authored-by: Bar Soloveychik <barsolo@fb.com>
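
For readers who want the mechanism in miniature, the grouping, deferred re-enable count, and deferred-only cleanup described in this message can be sketched roughly as below. This is an illustrative model only, not the code from the patch: `StepOverPlan`, `StepOverBatcher`, `Batch`, `ThreadFinished`, and `PopDeferredPlans` are hypothetical names, and LLDB's real thread plans, breakpoint sites, and vCont handling are not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins for illustration; these are not LLDB's classes.
typedef std::uint64_t Address;
typedef std::uint64_t ThreadId;

struct StepOverPlan {
  ThreadId tid;         // thread that must step over the breakpoint
  Address site;         // breakpoint address the thread is stopped at
  bool defer_reenable;  // true when the plan belongs to a multi-thread group
};

class StepOverBatcher {
public:
  // Groups plans by breakpoint address. Plans in groups of more than one
  // thread are marked deferred and the group size is recorded per address.
  // Returns the thread ids of the largest group (the batch to resume).
  std::vector<ThreadId> Batch(std::vector<StepOverPlan> &plans) {
    std::map<Address, std::vector<StepOverPlan *> > groups;
    for (std::size_t i = 0; i < plans.size(); ++i)
      groups[plans[i].site].push_back(&plans[i]);

    const std::vector<StepOverPlan *> *largest = 0;
    std::map<Address, std::vector<StepOverPlan *> >::iterator it;
    for (it = groups.begin(); it != groups.end(); ++it) {
      std::vector<StepOverPlan *> &group = it->second;
      if (group.size() > 1) {
        for (std::size_t i = 0; i < group.size(); ++i)
          group[i]->defer_reenable = true;
        pending_[it->first] = group.size();
      }
      if (!largest || group.size() > largest->size())
        largest = &group;
    }

    std::vector<ThreadId> batch;
    if (largest)
      for (std::size_t i = 0; i < largest->size(); ++i)
        batch.push_back((*largest)[i]->tid);
    return batch;
  }

  // Called when a deferred plan completes. Decrements the per-address count
  // and returns true only when it reaches zero, i.e. when the breakpoint
  // should be re-enabled. Untracked (non-deferred) sites return false here;
  // in this model such plans would re-enable the breakpoint themselves.
  bool ThreadFinished(Address site) {
    std::map<Address, std::size_t>::iterator it = pending_.find(site);
    if (it == pending_.end())
      return false;
    if (--it->second > 0)
      return false;
    pending_.erase(it);
    return true;
  }

private:
  std::map<Address, std::size_t> pending_;  // threads still stepping, per site
};

// Cleanup in the spirit of the fix: drop only deferred plans and leave
// non-deferred plans (e.g. from an incomplete single-step) in place.
inline void PopDeferredPlans(std::vector<StepOverPlan> &plans) {
  struct IsDeferred {
    static bool Check(const StepOverPlan &p) { return p.defer_reenable; }
  };
  plans.erase(std::remove_if(plans.begin(), plans.end(), IsDeferred::Check),
              plans.end());
}
```

In this model, three threads at one address and one thread at another yield a batch of three; the first two `ThreadFinished` calls for the shared address return false and the third returns true, which is the single re-enable. Running `PopDeferredPlans` afterwards leaves the lone non-deferred plan untouched, mirroring the behavior the fix restores.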