turbo
db01cb44 - perf: Fast path for shallow wildcard glob patterns in workspace discovery (#11972)

74 days ago
## Summary

Workspace package discovery uses glob patterns like `packages/*/package.json` to find workspace packages. Previously, each pattern was dispatched as a single rayon task running a sequential `wax` directory walk. For large monorepos with hundreds of packages under one directory, the slowest walker dominated the entire phase.

This PR adds a fast path for "shallow wildcard" patterns: globs with exactly one `*` segment between literal prefix and suffix. Instead of a full recursive directory walk, these are expanded via `readdir` + parallel `stat` calls spread across all rayon workers.

## Problem

Profiling `turbo build --dry-run` on large monorepos revealed extreme load imbalance in the glob walker phase:

| Glob pattern | Directory entries | Walk time |
|---|---|---|
| `packages/*/package.json` | ~600 | **54ms** |
| `services/*/package.json` | ~320 | **41ms** |
| `apps/*/package.json` | ~26 | 3ms |
| Other patterns | 1–30 | <1ms |

The slowest walker gates the entire `parse_package_jsons` phase. The 54ms walker runs sequentially on one rayon thread while other threads sit idle after finishing in <1ms.

## Approach

`walk_compiled_globs` now partitions include patterns into three categories:

1. **Invariant** (no wildcards): single `stat` syscall (existing optimization)
2. **Shallow wildcard** (`<prefix>/*/<suffix>`): `readdir` + parallel `stat` (**new**)
3. **General variant**: full `wax` directory walk (existing fallback)

`try_decompose_shallow_wildcard` detects eligible patterns:

- Rejects `**`, `?`, `[`, `{` metacharacters
- Requires exactly one bare `*` segment with a non-empty literal suffix
- Falls back to the full walk for anything it can't handle

## Results

Profiled with `--profile` on three monorepos of varying sizes (5, ~120, and ~1,200 packages):

**Glob walk phase (workspace discovery):**

| Metric | Before | After |
|---|---|---|
| Slowest `walk_glob` (large repo) | 54ms | 6ms |
| `walk_glob` count (large repo) | 10 | 4 |
| `compile_globs` (large repo) | 20ms | 2ms |
| Slowest `walk_glob` (medium repo) | 14ms | 0.4ms |

**End-to-end `turbo build --dry-run` (5-run average, profile-internal time):**

| Repo size | Baseline | With this change | Improvement |
|---|---|---|---|
| ~1,200 packages | 606ms | 455ms | **-25%** |
| ~120 packages | 308ms | 259ms | **-16%** |

Note: "With this change" includes other recent perf improvements on this branch. The isolated improvement from this PR is ~60ms (~12%) on the large repo.

## Testing

All 154 existing `globwalk` tests pass, including workspace-specific tests that exercise exclusion patterns, nested packages, and edge cases.
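The shallow-wildcard fast path described under Approach can be illustrated with a small standalone sketch. This is not the code from the PR. Only the name `try_decompose_shallow_wildcard` and the rejection rules come from the commit message; its signature, the `ShallowWildcard` struct, and `expand_shallow_wildcard` are hypothetical. The sketch also swaps rayon for `std::thread::scope` so that it compiles with the standard library alone, and it assumes an empty prefix (as in `*/package.json`) is acceptable.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::thread;

/// Hypothetical decomposition of `<prefix>/*/<suffix>` (not the PR's type).
#[derive(Debug, PartialEq)]
struct ShallowWildcard {
    prefix: String,
    suffix: String,
}

/// Returns `Some` only for patterns with exactly one bare `*` segment and a
/// non-empty literal suffix; `None` means "use another strategy".
fn try_decompose_shallow_wildcard(pattern: &str) -> Option<ShallowWildcard> {
    // Reject `**` and the `?`, `[`, `{` metacharacters outright.
    if pattern.contains("**") || pattern.chars().any(|c| matches!(c, '?' | '[' | '{')) {
        return None;
    }
    let segments: Vec<&str> = pattern.split('/').collect();
    // Exactly one segment may contain `*`, and it must be a bare `*`.
    if segments.iter().filter(|s| s.contains('*')).count() != 1 {
        return None;
    }
    let idx = segments.iter().position(|s| *s == "*")?;
    let suffix = segments[idx + 1..].join("/");
    if suffix.is_empty() {
        return None;
    }
    Some(ShallowWildcard {
        prefix: segments[..idx].join("/"),
        suffix,
    })
}

/// One `read_dir` on the prefix, then existence checks for
/// `<entry>/<suffix>` spread over scoped threads (a stand-in for rayon).
fn expand_shallow_wildcard(root: &Path, sw: &ShallowWildcard) -> Vec<PathBuf> {
    let base = root.join(&sw.prefix);
    let entries = match fs::read_dir(&base) {
        Ok(entries) => entries,
        Err(_) => return Vec::new(), // missing prefix directory: no matches
    };
    let candidates: Vec<PathBuf> = entries
        .filter_map(Result::ok)
        .map(|entry| entry.path().join(&sw.suffix))
        .collect();

    let workers = thread::available_parallelism().map_or(1, |n| n.get());
    // `chunks` panics on a size of zero, so clamp to at least 1.
    let chunk_size = ((candidates.len() + workers - 1) / workers).max(1);

    let mut found: Vec<PathBuf> = thread::scope(|scope| {
        // Spawn every worker first, then join them all.
        let handles: Vec<_> = candidates
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .filter(|path| path.is_file()) // one stat per candidate
                        .cloned()
                        .collect::<Vec<PathBuf>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("stat worker panicked"))
            .collect()
    });
    found.sort(); // deterministic order regardless of thread scheduling
    found
}
```

In this sketch a caller would try `try_decompose_shallow_wildcard` on each include pattern and fall back to the general directory walk whenever it returns `None`. Splitting the candidate list into chunks is what lets a single large directory such as `packages/` occupy every worker instead of one.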