next.js
f0c1ffc4 - Remove ineffective turbo-tasks (#91341)

Commit

82 days ago

Remove ineffective turbo-tasks (#91341)

## Remove ineffective turbo-tasks

Identifies and removes turbo-tasks functions where the task overhead exceeds the value they provide. Each turbo-task carries ~4-6μs execution overhead per miss and ~200-500ns per cache hit, plus allocations and bookkeeping.

### What?

Removes 22 `#[turbo_tasks::function]` implementations across resolve plugins, chunk items, and resolve-result helpers, converting them to plain methods or inlining their work. Changes fall into a few buckets:

- **ResolvePlugin condition handling** (`AfterResolvePluginCondition::matches`, `BeforeResolvePluginCondition::matches`, `after_resolve_condition`, `before_resolve_condition`): conditions now store the resolved `Glob` as a `ReadRef<Glob>` on the plugin struct at construction, so `matches` is a pure sync function and the per-plugin `*_resolve_condition` getters are trivial field reads (no longer turbo-tasks). The `after_resolve` / `before_resolve` hooks themselves stay as `#[turbo_tasks::function]`: they synthesize virtual sources/modules and need memoization on `(self, lookup_path, reference_type, request)` to avoid distinct cells producing duplicate module-graph idents.
  - The basic theory here is that the right level of caching is at `resolve` and at the hook bodies themselves, not the conditions or condition getters.
  - `AfterResolvePluginCondition` and `BeforeResolvePluginCondition` are marked `serialization = "none"` because `ReadRef` cannot be persisted; plugin construction is cheap enough to re-derive on restore.
- **ChunkItem trait methods** (`chunking_context`, `ty`, `content_with_async_module_info`): these returned constants or simple field reads, with zero cache hits and no `.await` calls (so no invalidation value).
- **ResolveResult / ModuleResolveResult helpers** (`primary_modules`, `first_module`, `first_source`, `primary_sources`, `is_unresolvable`, `primary_output_assets`): simple iterators over already-resolved data; converted to plain methods. Added a `Duplicate(usize)` variant to `ModuleResolveResultItem` to handle dedup at construction time instead of in a separate task.
  - The basic idea here is that it is reasonable to consume `ResolveResult/ModuleResolveResult` monolithically, and we get little to no benefit from fine-grained access. For example, `is_unresolved()` is in theory a valuable turbo-task because its result rarely changes, but when we change how an import resolves we generally have to regenerate code anyway, so saving a few boolean conditions is unlikely to be very valuable.
- Misc: `EcmascriptModuleAsset::analyze`, `is_types_resolving_enabled`, `next_server::resolve::condition`.

### Impact (vercel-site build, dev first-compile)

| Metric | Before | After | Δ |
|---|---:|---:|---:|
| Total cache hits | 30,885,827 | 29,201,314 | −1,684,513 |
| Total cache misses | 6,473,123 | 5,953,626 | **−519,497** |
| Overall hit rate | 82.67% | 83.06% | +0.39 pp |
| Registered task functions | 1,294 | 1,272 | −22 |

The 22 removed tasks were collectively responsible for ~519K misses per build, each miss previously paying the full execution overhead. Most of the work from `EcmascriptModuleAsset::analyze` naturally migrated into `analyze_ecmascript_module` (the task it was wrapping; +129K hits there).

### On-disk cache size (persistent caching)

Each removed task also stops allocating cache cells on disk. Measured on the same vercel-site build with `.next/cache/turbopack` (persistent cache enabled):

| | Size |
|---|---:|
| canary | 2.56 GiB |
| this branch | 2.46 GiB |
| **saved** | **~100 MiB (−3.81%)** |

### Build-time wall clock and peak memory

Ran `pnpm next build --experimental-build-mode=compile` 5 times on each branch.

**Peak RSS (clear reduction):**

| | canary | branch | Δ |
|---|---:|---:|---:|
| min | 19.18 GiB | 18.94 GiB | |
| **median** | **19.22 GiB** | **19.01 GiB** | **−217 MiB (−1.10%)** |
| mean | 19.21 GiB | 19.02 GiB | −199 MiB (−1.01%) |
| max | 19.23 GiB | 19.13 GiB | |

Every branch run has lower RSS than every canary run, so the distributions don't overlap. Welch's t = −6.03.

**Wall time (no measurable change):**

| | canary | branch | Δ |
|---|---:|---:|---:|
| min | 62.03s | 60.78s | |
| **median** | **62.61s** | **62.65s** | **+0.04s (+0.06%)** |
| mean | 62.83s | 63.80s | +0.96s (+1.53%) |
| max | 64.25s | 68.23s | |
| stddev | 0.84s | 3.42s | |

Median is flat. The mean difference is within noise (Welch's t = +0.61, n = 5). Branch run-to-run variance is higher (one 68.23s outlier pulls the mean up), so this is neither a regression nor a measurable speedup at this sample size.
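
The wall-time Welch's t above can be roughly reproduced from the summary rows alone. The following is a minimal sketch, not the script used for the measurements: the `Summary` struct and the `welch_t` / `welch_df` helpers are hypothetical names introduced here, and the inputs are the rounded mean and stddev values from the wall-time table, so the result only approximately matches the reported +0.61.

```rust
/// Summary statistics for one group of benchmark runs.
/// (Hypothetical helper type, not part of the commit.)
struct Summary {
    mean: f64,
    stddev: f64, // sample standard deviation, in seconds
    n: f64,      // number of runs
}

/// Welch's t statistic for two groups with unequal variances:
/// t = (mean_b - mean_a) / sqrt(s_a^2 / n_a + s_b^2 / n_b)
fn welch_t(a: &Summary, b: &Summary) -> f64 {
    let standard_error = (a.stddev.powi(2) / a.n + b.stddev.powi(2) / b.n).sqrt();
    (b.mean - a.mean) / standard_error
}

/// Welch-Satterthwaite approximation of the degrees of freedom.
fn welch_df(a: &Summary, b: &Summary) -> f64 {
    let va = a.stddev.powi(2) / a.n;
    let vb = b.stddev.powi(2) / b.n;
    (va + vb).powi(2) / (va.powi(2) / (a.n - 1.0) + vb.powi(2) / (b.n - 1.0))
}

fn main() {
    // Rounded values taken from the wall-time table (seconds, 5 runs each).
    let canary = Summary { mean: 62.83, stddev: 0.84, n: 5.0 };
    let branch = Summary { mean: 63.80, stddev: 3.42, n: 5.0 };

    println!(
        "t = {:.2}, df = {:.1}",
        welch_t(&canary, &branch),
        welch_df(&canary, &branch)
    );
}
```

With these rounded inputs the statistic comes out at roughly 0.6 with about 4.5 degrees of freedom. The two-sided 5% critical value for 4 to 5 degrees of freedom lies between about 2.57 and 2.78, so the mean difference is well inside the noise, consistent with the conclusion above.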

References

#91341 - Remove ineffective turbo-tasks

Author

lukesandberg

Parents

e83bc939
