turbo-tasks-backend: batch find_and_schedule_dirty using for_each_task_meta (#91497)
### What?
Batch-process `find_and_schedule_dirty` in `aggregation_update.rs` by
collecting all queued jobs (up to `FIND_AND_SCHEDULE_BATCH_SIZE` =
10,000) into a `SmallVec` and pre-fetching their task metadata with a
batched `ctx.for_each_task_meta(...)` call.
### Why?
`find_and_schedule` can accumulate thousands of tasks during
invalidation cascades. The previous implementation issued one
`ctx.task(...)` call per task inside `process()`, so backing-storage
fetches ran serially, one task at a time.
`ctx.for_each_task_meta` triggers a batched fetch of all task metadata
from the backing store: keys are sorted by hash for cache-friendly
sequential access to the storage layer. The callback is invoked per-task
once data is ready, with the task guard handed directly to the callback
— no second lock acquisition needed.
The batch limit is set to 10,000 because find-and-schedule jobs are
cheap (a metadata read plus an optional schedule) compared to
aggregation-update jobs, so yielding less often is safe and beneficial.
### How?
**`fn process` — `find_and_schedule` branch:**
- Replace the one-at-a-time loop with a single
`drain(..FIND_AND_SCHEDULE_BATCH_SIZE)` that collects up to 10,000
`FindAndScheduleJob` structs into a `SmallVec`, then calls
`find_and_schedule_dirty` once with the whole batch.
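The batching step can be sketched roughly as follows. This is a minimal, self-contained model, not the real implementation: `FindAndScheduleJob` is reduced to a placeholder struct, the queue is a plain `VecDeque`, and a `Vec` stands in for the `SmallVec` to keep the sketch dependency-free.

```rust
use std::collections::VecDeque;

// Batch limit from the PR; jobs beyond this stay queued for the next pass.
const FIND_AND_SCHEDULE_BATCH_SIZE: usize = 10_000;

// Hypothetical stand-in for the real FindAndScheduleJob struct.
#[derive(Debug, PartialEq)]
struct FindAndScheduleJob {
    task_id: u32,
}

// Drain up to FIND_AND_SCHEDULE_BATCH_SIZE queued jobs in one go and hand
// the whole batch to the processing function in a single call, instead of
// calling it once per job.
fn drain_find_and_schedule_batch(
    queue: &mut VecDeque<FindAndScheduleJob>,
) -> Vec<FindAndScheduleJob> {
    let take = queue.len().min(FIND_AND_SCHEDULE_BATCH_SIZE);
    // In the real code, find_and_schedule_dirty(batch) is called once here.
    queue.drain(..take).collect()
}

fn main() {
    let mut queue: VecDeque<FindAndScheduleJob> =
        (0..15_000).map(|i| FindAndScheduleJob { task_id: i }).collect();
    let batch = drain_find_and_schedule_batch(&mut queue);
    assert_eq!(batch.len(), 10_000);
    assert_eq!(queue.len(), 5_000);
}
```

Leftover jobs remain in the queue, so an oversized backlog is simply worked off over successive `process()` passes.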
**`fn find_and_schedule_dirty`:**
- Change parameter from `task_id: TaskId` to `jobs:
SmallVec<[FindAndScheduleJob; 4]>`.
- Call `ctx.for_each_task_meta(...)` to batch-prefetch all task metadata
(sorted by hash for cache-friendly access) and process each task in the
callback.
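A simplified model of the `for_each_task_meta` pattern is sketched below. The names (`TaskMeta`, `hash_of`, the `HashMap` standing in for the backing store) are illustrative stand-ins, not the real backend types; the point is the shape: sort the requested keys by hash, fetch in that order, and hand each result directly to the callback.

```rust
use std::collections::HashMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

type TaskId = u32;

// Hypothetical task metadata; the real type lives in the backend storage.
#[derive(Clone)]
struct TaskMeta {
    dirty: bool,
}

fn hash_of(id: TaskId) -> u64 {
    let mut h = DefaultHasher::new();
    id.hash(&mut h);
    h.finish()
}

// Simplified model of ctx.for_each_task_meta: sort the requested ids by
// key hash so backing-store reads happen in cache-friendly order, then
// invoke the callback once per task with the fetched metadata, so no
// second lookup is needed inside the callback.
fn for_each_task_meta(
    store: &HashMap<TaskId, TaskMeta>,
    mut ids: Vec<TaskId>,
    mut callback: impl FnMut(TaskId, &TaskMeta),
) {
    ids.sort_by_key(|&id| hash_of(id));
    for id in ids {
        if let Some(meta) = store.get(&id) {
            callback(id, meta);
        }
    }
}

fn main() {
    let store: HashMap<TaskId, TaskMeta> = (0..8)
        .map(|id| (id, TaskMeta { dirty: id % 2 == 0 }))
        .collect();
    // Schedule every dirty task found in the batch.
    let mut scheduled = Vec::new();
    for_each_task_meta(&store, (0..8).collect(), |id, meta| {
        if meta.dirty {
            scheduled.push(id);
        }
    });
    scheduled.sort();
    assert_eq!(scheduled, vec![0, 2, 4, 6]);
}
```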
**`prepare_tasks_with_callback` fix:**
- Release `TaskLockCounter` *before* calling `prepared_task_callback`
instead of after. This ensures the counter is 0 when the callback runs,
so callbacks that drop their task guard and then call `ctx.task()` (like
`find_and_schedule_dirty_internal` → `ctx.schedule()`) no longer trigger
the "Concurrent task lock acquisition detected" panic in debug builds.
- `for_each_task` now uses `acquire()` instead of `reacquire()`, since
the counter is guaranteed to be 0 at callback entry. The now-unused
`reacquire()` is removed.
**Cleanup:**
- Use `FxHashMap` (already imported) instead of
`std::collections::HashMap`.
- Combine consecutive `#[cfg(trace_find_and_schedule)]` let bindings for
clarity.
- Add `FIND_AND_SCHEDULE_BATCH_SIZE = 10_000` as a self-contained
constant (not derived from `MAX_COUNT_BEFORE_YIELD`).
---------
Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>