next.js
ed1c6ccb - perf(ecmascript): shrink JsValue 64→32 bytes (#93106)

Commit

65 days ago

perf(ecmascript): shrink JsValue 64→32 bytes (#93106)

Three changes that together shrink `JsValue` from **64 → 32 bytes** (−50%). `JsValue` is the analyzer's value type and is instantiated millions of times during large-app analysis, so this directly reduces peak memory and improves cache locality on the hot `link`/`replace_builtin` paths.

## Changes

### 1. Box `RequireContextValue` inside `WellKnownFunctionKind`

`RequireContextValue` carries an `FxIndexMap` and is only used by three variants (`RequireContextRequire{,Keys,Resolve}`). Boxing it pulls those variants' payload down from 48B to 8B, dropping `WellKnownFunctionKind` from 64 → 16 bytes and `JsValue` from 64 → 48 bytes.

### 2. Unify `MemberCall`/`New`/`Call` payload into a single `Vec`

`MemberCall` was `(u32, Box<JsValue>, Box<JsValue>, Vec<JsValue>)` (48B payload). It is now `(u32, Vec<JsValue>)` (24B payload) with storage layout `[args..., prop, obj]`.

The reversed order is the key trick: on the common fallthrough in `builtin::replace_builtin` (convert `obj.prop(args)` into a `Call(Member(obj, prop), args)`), we pop `obj` then `prop` off the tail of the `Vec` and reuse the remainder **as** the args `Vec`, with zero extra allocations on the hot path. A similar optimization was applied to `New` and `Call`.

A tricky thing about this layout is avoiding reallocations of the args `Vec` when constructing. To make this easier, I added parallel factory methods `JsValue::call_from_parts` and `JsValue::call_from_iter` that eliminate those reallocations.

This drops `JsValue` from 48 → 40 bytes.

### 3. Replace `Unknown::reason` with `RcStr`

The `reason` field of `Unknown` was a `Cow<'static, str>`, but nearly all uses were `&'static str` values, so they were migrated to `rcstr!`, which drops the `reason` field from 24 bytes to 8 bytes.
This drops `JsValue` from 40 → 32 bytes.

## Test plan

- [x] `cargo test -p turbopack-ecmascript --lib`: all 352 tests pass (snapshot fixtures byte-identical)

## Benchmark results

Ran `cargo bench -p turbopack-ecmascript --bench analyzer` on this branch vs `canary` (59 fixtures × 2 benches each, criterion baselines).

| Bench | Sum (canary → branch) | Δ | Geomean ratio |
|---|---|---|---|
| `link` | 72.26 ms → 66.41 ms | **−8.09%** | **−2.94%** |
| `create_graph` | 8.15 ms → 7.93 ms | **−2.60%** | **−2.95%** |

The biggest absolute wins are on `link`, the JsValue-heavy hot path (the linker walks every node, clones, hashes via `similar_hash`, and compares via `similar`).

**Notable wins on `link`:**

- `react-dom-production` **−5.70%** (38.40 ms → 36.21 ms; the largest fixture, ~2 ms saved)
- `peg` **−11.39%** (25.65 ms → 22.72 ms)
- `cycle-cache` **−9.04%** (3.94 ms → 3.58 ms)
- `md5` **−19.19%** (1.09 ms → 879 µs)
- `md5-reduced` **−12.75%** (386 µs → 337 µs)
- `md5_2` **−11.33%** (783 µs → 694 µs)

**Notable wins on `create_graph`:**

- `webpack-target-node` **−8.30%**
- `require-context` **−7.67%**
- `mongoose-reduced` **−6.49%**
- `process-and-os` **−6.11%**
- `peg` **−5.53%**
- `cycle-cache` **−4.36%**

**Regressions:** zero in `create_graph`. Three nominal "+1.5% to +3.5%" outliers in `link` (`link/2`, `link/object`, `link/try`) all reproduce as **−4% to −7% improvements** when re-run individually against canary. They are run-to-run noise on small fixtures (criterion CI ≈ ±2-3%), not real regressions. Every fixture above ~1 ms shows a clear improvement consistently.

---------

Co-authored-by: Luke Sandberg <lukesandberg@users.noreply.github.com>
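
The reversed `[args..., prop, obj]` layout from change 2 can be illustrated with a small standalone sketch. It does not use the real turbopack types: `Value`, `member_call`, and `into_call` are hypothetical stand-ins invented for illustration, the `u32` field is omitted, and `Call` keeps a separate boxed callee for readability, even though the commit says `Call` received a similar single-`Vec` treatment. The sketch only shows why popping `obj` and `prop` off the tail lets the remaining buffer serve directly as the args `Vec`.

```rust
// Toy stand-in for the analyzer's value type; NOT the real turbopack `JsValue`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Const(&'static str),
    Member(Box<Value>, Box<Value>),
    // Simplified for the sketch: callee boxed separately from the args.
    Call(Box<Value>, Vec<Value>),
    // Single-Vec payload stored as [args..., prop, obj].
    MemberCall(Vec<Value>),
}

impl Value {
    /// Hypothetical constructor (assumed name): sizes the Vec for the args
    /// plus `prop` and `obj` up front, so the two trailing pushes stay
    /// within the capacity requested at creation.
    fn member_call<I>(obj: Value, prop: Value, args: I) -> Value
    where
        I: IntoIterator<Item = Value>,
        I::IntoIter: ExactSizeIterator,
    {
        let args = args.into_iter();
        let mut items = Vec::with_capacity(args.len() + 2);
        items.extend(args);
        items.push(prop);
        items.push(obj);
        Value::MemberCall(items)
    }

    /// Converts `obj.prop(args)` into `Call(Member(obj, prop), args)`.
    /// Popping from the tail leaves exactly the args behind, so the
    /// original buffer is moved into the `Call` without being copied.
    fn into_call(self) -> Value {
        match self {
            Value::MemberCall(mut items) => {
                let obj = items.pop().expect("layout guarantees obj");
                let prop = items.pop().expect("layout guarantees prop");
                let callee = Value::Member(Box::new(obj), Box::new(prop));
                Value::Call(Box::new(callee), items)
            }
            other => other,
        }
    }
}
```

For example, `Value::member_call(Value::Const("obj"), Value::Const("prop"), vec![Value::Const("a")]).into_call()` yields a `Call` whose args `Vec` is the same heap buffer that backed the `MemberCall`. Had the layout been `[obj, prop, args...]`, removing the first two elements would require shifting or copying the args.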

References

#93106 - perf(ecmascript): shrink JsValue 64→32 bytes

Author

mmastrac

Parents

d55787fe
