next.js
e2fca6c6 - [Turbopack] Add graph-based CSS chunking algorithm behind experimental.cssChunking: "graph" (#93606)

Commit
8 hours ago
[Turbopack] Add graph-based CSS chunking algorithm behind experimental.cssChunking: "graph" (#93606)

### What?

Adds an alternative CSS chunking algorithm to Turbopack, opted into via:

```js
// next.config.js
module.exports = {
  experimental: {
    cssChunking: 'graph',
    // or, with explicit cost overrides:
    // cssChunking: { type: 'graph', requestCost: 20_000, moduleFactorCost: 1 },
  },
}
```

The new algorithm is **off by default**: Turbopack still uses the existing "loose"/dependencies algorithm unless this flag is set, so this PR is a pure addition for users that don't opt in.

While we were here, the `experimental.cssChunking` shape was also generalized so every existing string accepts an object form too:

| Value | Bundler | Notes |
|---|---|---|
| `true` / `'loose'` / `{ type: 'loose' }` | both | default heuristic-based chunking |
| `'strict'` / `{ type: 'strict' }` | webpack | unchanged |
| `false` | webpack | unchanged (one chunk per CSS module) |
| `'graph'` / `{ type: 'graph', requestCost?, moduleFactorCost? }` | Turbopack | new |

Cross-bundler combinations are rejected at config-validation time:

- `'graph'` with webpack throws.
- `'strict'` and `false` with Turbopack throw.

### Why?

The existing Turbopack CSS chunker (loose / dependencies) is good at preserving CSS ordering but doesn't share chunks across pages well: every page tends to load its own chunk per CSS module, which scales poorly for apps with many pages and shared component libraries.

The new "graph" algorithm models the per-chunk-group CSS ordering as a weighted DAG over modules, then greedily merges adjacent runs in the global topological order whenever the merge reduces total cost. The cost model charges every CSS request and overshipped byte, with two tunable knobs (`requestCost` and `moduleFactorCost`).

**Trade-off vs. the loose default.** With the default cost parameters (`requestCost: 20_000`, `moduleFactorCost: 1`) the graph algorithm typically ships **less CSS per chunk group at the cost of more requests** than the loose algorithm. The cost model is tuned to avoid overshipping unrelated CSS into pages that don't need it; on apps where the loose algorithm was already collapsing a lot into one big chunk that some pages didn't actually use, the graph algorithm will split it. Apps that prefer fewer requests can raise `requestCost`; apps that prefer less overshipping can raise `moduleFactorCost`.

This is opt-in and Turbopack-only because:

- The cost model is sensitive to per-app properties (number of pages, size distribution of CSS modules, …); keeping it experimental gives us room to tune defaults from real usage.
- Webpack already has its own `CssChunkingPlugin` and `'strict'` mode that cover the equivalent design space; we don't want to fork that.

### Performance

Measured on `vercel.com` (the full graph algorithm spans `create_graph → make_acyclic → linearize → split_into_chunks → assemble`):

- **~3s** end-to-end for the synchronous chunking pipeline on a realistic production input.

Implementation choices that matter for that throughput:

- Tarjan SCC uses `Vec<u32>` / `Vec<bool>` scratch arrays indexed by `NodeIndex`, with no hashing on `indices` / `lowlinks` / `on_stack`.
- `make_acyclic` batches multiple cuts per SCC pass by seeding successive short-cycle searches at the previous cut's target, only re-running Tarjan when no further cycle is reachable from the seed.
- `find_short_cycle` is a bidirectional Dijkstra over a `BinaryHeap` with predecessor pointers (no path cloning) and skips its refinement loop for trivial 2-cycles.
- `split_into_chunks` picks the next merge from a `BinaryHeap` keyed on the cost delta instead of an O(N) linear scan per merge.
- `chunk_cost` reads a once-built `module_to_groups` inverse index instead of scanning every chunk group on every call; the GlobalStyle leakage check uses binary search on the inverse index rather than scanning each group's module list.

### How?

#### Module layout (`turbopack/crates/turbopack-core/src/module_graph/`)

The two algorithms are deliberately split so neither imports from the other:

- `style_groups/`: algorithm-neutral output types (`StyleGroups`, `StyleItemInfo`, `make_style_groups`). Both algorithms produce these.
- `style_groups_loose/`: the existing ("loose") algorithm plus the shared config types (`StyleGroupsAlgorithm`, `StyleGroupsConfig`, `F32TaskInput`).
- `style_groups_graph/`: the new algorithm. Pure Rust, no `Vc`, with `petgraph::DiGraph` plus a thin `SubgraphView` wrapper and a small `ReadonlyGraph` trait that lets the same pipeline run against either a `&DiGraph` or a filtered view of one SCC.

#### Algorithm

```text
create_graph → make_acyclic → linearize → split_into_chunks → assemble batches
```

1. **`create_graph`**: for each chunk group, every `(later, earlier)` pair inside the group's CSS-module list becomes an edge `later → earlier` (weight 1, accumulated). Heavy edges = strong co-occurrence.
2. **`make_acyclic`**: co-occurrence almost always introduces cycles; each multi-node SCC has its lowest-weight cycle edge cut until the graph is a DAG.
3. **`linearize`**: Kahn-style topological sort with a tie-break on edge weight, so strongly co-occurring modules end up adjacent in the global order.
4. **`split_into_chunks`**: greedy bottom-up merger over the global order. At every active split point we score the merge as `cost(merged) - cost(left) - cost(right)`, take the most-negative score from a min-heap, and repeat until no merge would reduce cost. `max_chunk_size` and "global CSS must not leak into unrelated chunk groups" are enforced as `+infinity` cost.
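
To make step 1 concrete, here is a minimal JavaScript sketch of the edge accumulation it describes. It is an illustration only, not the Rust implementation: `createGraphSketch`, the string-keyed `Map`, and the sample file names are all assumptions made for this example, and the sketch assumes each module appears at most once per chunk group.

```js
// Hypothetical sketch of the edge-building step; NOT the actual Rust code.
// `chunkGroups` is an array of CSS-module lists, each list in load order.
function createGraphSketch(chunkGroups) {
  // Key is "later->earlier", value is the accumulated edge weight.
  const edges = new Map()
  for (const modules of chunkGroups) {
    for (let later = 1; later < modules.length; later++) {
      for (let earlier = 0; earlier < later; earlier++) {
        const key = `${modules[later]}->${modules[earlier]}`
        // Every (later, earlier) pair contributes weight 1 per chunk group.
        edges.set(key, (edges.get(key) ?? 0) + 1)
      }
    }
  }
  return edges
}

// Two groups agree on the order of reset.css and button.css; one disagrees.
const edges = createGraphSketch([
  ['reset.css', 'button.css', 'page-a.css'],
  ['reset.css', 'button.css', 'page-b.css'],
  ['button.css', 'reset.css'],
])

// The disagreement yields a 2-cycle between the two modules.
console.log(edges.get('button.css->reset.css'), edges.get('reset.css->button.css'))
```

In this sample input the third group loads the two shared modules in the opposite order, so the graph contains both `button.css->reset.css` (weight 2) and `reset.css->button.css` (weight 1). That is the kind of cycle step 2 has to break, and under the "lowest-weight cycle edge" rule the weight-1 edge would be the one cut.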
The cost model is:

```text
cost_per_group(chunk, group) = chunk_size + (chunk_size / group_total_size) * module_factor_cost + request_cost
```

summed over the chunk groups that load the chunk.

#### Wiring

- `StyleGroups::shared_chunk_items` is a `FxIndexMap<ChunkItemWithAsyncModuleInfo, StyleItemInfo>` where `StyleItemInfo { order: Option<u32>, batch: Option<…> }`. The graph algorithm fills `order` so `style_production.rs` can stable-sort chunks globally; the legacy algorithm leaves `order = None`, which makes the sort a no-op for it. `flatten_and_sort` returns the `StyleItemInfo` references alongside each chunk item so the per-item loop doesn't re-query the map.
- A new `StyleGroupsAlgorithm` enum on `ChunkingConfig` selects the algorithm at chunking time; `ModuleGraph::style_groups` dispatches to either `compute_style_groups` (existing) or `compute_style_groups_graph` (new).
- `next-core` exposes `NextConfig::css_chunking() -> Vc<CssChunkingAlgorithm>` resolving the JS `experimental.cssChunking` to the Rust enum, with cost defaults applied (`requestCost: 20_000`, `moduleFactorCost: 1`). All three chunking-context constructors (`next_client`, `next_edge`, `next_server`) thread it through.

#### Configuration

- `experimental.cssChunking` zod schema accepts the new shapes; cost params are `z.number().nonnegative().finite().optional()`.
- `config-shared.ts` exports a `CssChunkingConfig` type alias and a `resolveCssChunkingMode(value)` helper that normalizes any input to one of `'off' | 'loose' | 'strict' | 'graph'`. Both `webpack-config.ts` (plugin wiring) and `config.ts` (bundler-compat validation) use the helper.
- New `errors.json` entries for the three bundler-compatibility validation errors (E1193 graph-on-webpack, E1194 strict-on-Turbopack, E1195 false-on-Turbopack).
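
As a rough illustration of the normalization and bundler-compatibility rules described above, the following sketch maps each accepted shape to a mode and rejects the three cross-bundler combinations. The function names, the handling of an unset value, and the error messages are assumptions made for this example; the real `resolveCssChunkingMode` helper and the E1193 to E1195 error texts may differ.

```js
// Hypothetical sketch; NOT the helper exported from config-shared.ts.
function resolveModeSketch(value) {
  // Assumption: an unset value behaves like the default 'loose' mode.
  if (value === undefined || value === true) return 'loose'
  if (value === false) return 'off'
  // String form: 'loose' | 'strict' | 'graph'.
  if (typeof value === 'string') return value
  // Object form, e.g. { type: 'graph', requestCost: 1, moduleFactorCost: 1 }.
  return value.type
}

// Mirrors the three validation rules; messages are placeholders only.
function assertBundlerCompatSketch(value, bundler) {
  const mode = resolveModeSketch(value)
  if (bundler === 'webpack' && mode === 'graph') {
    throw new Error('placeholder: "graph" is only supported by Turbopack')
  }
  if (bundler === 'turbopack' && (mode === 'strict' || mode === 'off')) {
    throw new Error(`placeholder: "${mode}" is only supported by webpack`)
  }
  return mode
}

console.log(assertBundlerCompatSketch({ type: 'graph', requestCost: 1 }, 'turbopack'))
```

Normalizing to a single mode string first keeps the plugin wiring and the validation step from each having to understand the boolean, string, and object forms separately.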
#### Tests

- 53 Rust unit tests in `style_groups_graph/tests.rs` cover `create_graph`, Tarjan SCC, `find_short_cycle` (bidirectional Dijkstra), `make_acyclic`, `linearize`, `split_into_chunks`, and end-to-end pipeline scenarios.
- `test/e2e/app-dir/css-order/css-order.test.ts` is parametrised over `[label, value]` pairs. The Turbopack matrix now includes `'graph'` and an object-form `{ type: 'graph', requestCost: 1, moduleFactorCost: 1 }` in addition to the existing default. Per-page expectations grew a `requests` object encoding distinct request counts for `loose` and `graph` where they differ.
- A new `sandwich` e2e fixture (`/sandwich/a`, `/sandwich/b`) exercises the case where two pages share a leading and trailing chunk around a unique middle stylesheet, including a global stylesheet that the algorithm must not leak into unrelated chunk groups. The graph algorithm hits the optimal 3 chunks per page on this fixture; loose mode falls short.

#### Documentation

- `ExperimentalConfig.cssChunking` JSDoc describes every accepted shape and what each cost knob does.
- The `style_groups_graph` module-level docs describe the pipeline, cost model and constraints with diagrams.

Closes NEXT-

<!-- NEXT_JS_LLM_PR -->

---------

Co-authored-by: v-work-app[bot] <262237222+v-work-app[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>