next.js
b77eb3e1 - Turbopack: fix hanging problem due to stale tasks (#81413)

Commit
314 days ago
### What?

When tasks become dirty, they eventually need to be scheduled again when needed. To do that, we maintain the "activeness" of tasks. We also maintain the "dirtiness" of subgraphs, to allow for strong consistency of a subgraph.

All of that is a bit more involved, since we don't want to touch every task of a subgraph (a subgraph could have millions of nodes). So we do some "aggregation" of subgraphs to reduce the number of affected tasks.

Once a task becomes dirty, we propagate that dirtiness up the aggregated tasks: every aggregated task has a list of inner tasks whose subgraphs contain dirty tasks (`dirty_containers`). This way we can follow the graph directly down to the dirty tasks without walking the whole graph.

This is where activeness comes into play. When a task is active, we want to schedule all dirty tasks in its subgraph. This can happen in two cases:

1. A task becomes active -> all `dirty_containers` are scheduled
2. A `dirty_container` propagates to an already active task -> that task is scheduled

There is also the case where a task is newly connected to an active task. This is covered by case 2, because a newly connected task applies its aggregated info to the upper task, which then hits case 2.

All root tasks are active as long as they are relevant (you can dispose of them, and some are one-off tasks). So when a task becomes dirty, it propagates the `dirty_container` to the root tasks, which schedule the subgraph by walking the `dirty_containers`.

But this would break if another task becomes dirty below the same aggregated task: the aggregated task is already listed as a `dirty_container` in the root tasks, so it would not be scheduled again. To handle this, all aggregated tasks that are listed as a `dirty_container` of an active task are made temporarily active (`active_until_clean`). This also has the benefit that we don't need as many hops to schedule a task.

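The two scheduling cases can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not Turbopack's actual implementation: `Graph`, `Task`, `activate`, `mark_dirty`, `propagate`, and `schedule_dirty` are hypothetical names, "scheduling" is reduced to appending to a list, and tasks never become clean again. Only `dirty_containers` is taken from the description above. The model deliberately omits `active_until_clean`, so it also reproduces the problem that flag is meant to solve.

```rust
use std::collections::{HashMap, HashSet};

type TaskId = u32;

// Hypothetical toy task; only `dirty_containers` mirrors a name from the text.
#[derive(Default)]
struct Task {
    dirty: bool,
    active: bool,
    uppers: Vec<TaskId>,               // tasks this task aggregates into
    dirty_containers: HashSet<TaskId>, // inner tasks whose subgraphs contain dirty tasks
}

#[derive(Default)]
struct Graph {
    tasks: HashMap<TaskId, Task>,
    scheduled: Vec<TaskId>, // stand-in for a real scheduler queue
}

impl Graph {
    fn add_task(&mut self, id: TaskId, uppers: &[TaskId]) {
        let task = Task { uppers: uppers.to_vec(), ..Default::default() };
        self.tasks.insert(id, task);
    }

    // Follow `dirty_containers` downwards and schedule every dirty task found,
    // without visiting clean parts of the graph.
    fn schedule_dirty(&mut self, id: TaskId) {
        let task = &self.tasks[&id];
        let is_dirty = task.dirty;
        let mut inner: Vec<TaskId> = task.dirty_containers.iter().copied().collect();
        inner.sort(); // deterministic order for the example
        if is_dirty && !self.scheduled.contains(&id) {
            self.scheduled.push(id);
        }
        for child in inner {
            self.schedule_dirty(child);
        }
    }

    // Case 1: a task becomes active, so its dirty containers are scheduled.
    fn activate(&mut self, id: TaskId) {
        self.tasks.get_mut(&id).unwrap().active = true;
        self.schedule_dirty(id);
    }

    fn mark_dirty(&mut self, id: TaskId) {
        self.tasks.get_mut(&id).unwrap().dirty = true;
        self.propagate(id);
    }

    // Propagate "my subgraph contains dirty tasks" to the upper tasks.
    fn propagate(&mut self, inner: TaskId) {
        let uppers = self.tasks[&inner].uppers.clone();
        for upper in uppers {
            let task = self.tasks.get_mut(&upper).unwrap();
            let newly_listed = task.dirty_containers.insert(inner);
            let upper_active = task.active;
            if !newly_listed {
                continue; // already listed: nothing new reaches the tasks above
            }
            if upper_active {
                self.schedule_dirty(inner); // Case 2
            }
            self.propagate(upper);
        }
    }
}
```

In this model, with an active root `1`, an aggregated task `2` below it, and leaves `3` and `4` below `2`, calling `mark_dirty(3)` schedules `3` through case 2. A later `mark_dirty(4)` schedules nothing, because `2` is already listed in the root's `dirty_containers` and is not itself active. That is the gap `active_until_clean` closes.
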

This works in most cases, but there is a race condition in this design, which this pull request fixes. As described, only aggregated tasks are made temporarily active. But there is an edge case: while a task is already dirty, it is converted from a leaf task into an aggregated task, and an inner task becomes dirty. The newly aggregated task is not temporarily active, because it was not an aggregated task when it was scheduled. So the inner task is not scheduled, since the upper task is not (temporarily) active, and it is never executed and stays stale. But a strongly consistent read further up the graph will wait for this task to become not dirty, since it is listed as a `dirty_container`. This leads to a hanging build.

To fix that, we make all tasks temporarily active, even leaf tasks.
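
The race and the fix can be reproduced in a toy model. Again, this is a hedged sketch and not the code changed by this commit: `Policy`, `Graph`, `Task`, `convert_to_aggregated`, and `run` are hypothetical names, scheduling is reduced to appending to a list, and tasks are never cleaned. Only `dirty_containers` and `active_until_clean` come from the description.

```rust
use std::collections::{HashMap, HashSet};

type TaskId = u32;

// Hypothetical switch between the behavior before and after the fix.
#[derive(Clone, Copy, PartialEq)]
enum Policy {
    AggregatedOnly, // before: only aggregated tasks become temporarily active
    AllTasks,       // after: leaf tasks become temporarily active too
}

#[derive(Default)]
struct Task {
    dirty: bool,
    active: bool,
    active_until_clean: bool,
    aggregated: bool,
    uppers: Vec<TaskId>,
    dirty_containers: HashSet<TaskId>,
}

struct Graph {
    policy: Policy,
    tasks: HashMap<TaskId, Task>,
    scheduled: Vec<TaskId>, // stand-in for a real scheduler queue
}

impl Graph {
    fn new(policy: Policy) -> Self {
        Graph { policy, tasks: HashMap::new(), scheduled: Vec::new() }
    }

    fn add_task(&mut self, id: TaskId, uppers: &[TaskId], aggregated: bool) {
        let task = Task { aggregated, uppers: uppers.to_vec(), ..Default::default() };
        self.tasks.insert(id, task);
    }

    // Schedule dirty tasks below `id`, marking visited tasks temporarily
    // active according to the policy.
    fn schedule_dirty(&mut self, id: TaskId) {
        let policy = self.policy;
        let task = self.tasks.get_mut(&id).unwrap();
        if task.aggregated || policy == Policy::AllTasks {
            task.active_until_clean = true;
        }
        let is_dirty = task.dirty;
        let mut inner: Vec<TaskId> = task.dirty_containers.iter().copied().collect();
        inner.sort(); // deterministic order for the example
        if is_dirty && !self.scheduled.contains(&id) {
            self.scheduled.push(id);
        }
        for child in inner {
            self.schedule_dirty(child);
        }
    }

    fn mark_dirty(&mut self, id: TaskId) {
        self.tasks.get_mut(&id).unwrap().dirty = true;
        self.propagate(id);
    }

    fn propagate(&mut self, inner: TaskId) {
        let uppers = self.tasks[&inner].uppers.clone();
        for upper in uppers {
            let task = self.tasks.get_mut(&upper).unwrap();
            let newly_listed = task.dirty_containers.insert(inner);
            let upper_active = task.active || task.active_until_clean;
            if !newly_listed {
                continue; // already listed above: propagation stops here
            }
            if upper_active {
                self.schedule_dirty(inner);
            }
            self.propagate(upper);
        }
    }

    // A dirty leaf turns into an aggregated task by gaining an inner task.
    // Its `active_until_clean` flag is left as it was when it was scheduled.
    fn convert_to_aggregated(&mut self, id: TaskId, new_inner: TaskId) {
        self.tasks.get_mut(&id).unwrap().aggregated = true;
        self.add_task(new_inner, &[id], false);
    }
}

// Replays the edge case: leaf 2 becomes dirty, is converted, then inner 3 becomes dirty.
fn run(policy: Policy) -> Vec<TaskId> {
    let mut g = Graph::new(policy);
    g.add_task(1, &[], true);
    g.tasks.get_mut(&1).unwrap().active = true; // active root
    g.add_task(2, &[1], false);
    g.mark_dirty(2);
    g.convert_to_aggregated(2, 3);
    g.mark_dirty(3);
    g.scheduled
}
```

In this model, `run(Policy::AggregatedOnly)` schedules only task `2`: task `3` stays dirty and unscheduled while `2` remains listed in the root's `dirty_containers`, which corresponds to the hang described above. `run(Policy::AllTasks)` schedules both `2` and `3`.
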