turbo
57f915a4 - perf: Skip redundant file writes during cache restore using manifests (#12209)

Commit
76 days ago
perf: Skip redundant file writes during cache restore using manifests (#12209)

## Summary

On repeat cache-hit runs, skips writing files that are already correct on disk. A manifest records the size, mtime, and file mode of each file after restoration. On subsequent restores of the same hash, files matching all three properties are skipped.

## Why

Cache restoration writes ~1.24GB to disk per run (5 Next.js apps × 248MB decompressed each). On repeat runs where outputs haven't changed, every byte is identical to what's already on disk. This dominated wall clock time at ~880ms.

## Benchmark (110-package monorepo, local macOS, repeat cache-hit)

| Metric | Before | After |
|---|---|---|
| Wall clock | 1.5-1.6s | **453ms** |
| `cache_reader_restore` CPU | 4.4s (280%) | 245ms (54%) |
| `visit_recv_wait` | 931ms | 47ms |

## How it works

**On restore:** After writing each file, `stat()` it and record `(size, mtime_nanos, mode)` in a manifest at `.turbo/cache/{hash}-manifest.json`. The manifest is written asynchronously on a background thread so cache misses pay zero overhead.

**On subsequent restore of the same hash:** Before writing each file, check the manifest. If the file exists on disk with matching size, mtime, and mode → skip the write, advance the tar stream with `io::copy(entry, io::sink())`. Otherwise, write normally.

## Skip conditions

A write is skipped only when ALL are true:

1. Manifest exists for this hash
2. File exists on disk
3. Size matches recorded value
4. mtime matches recorded value (detects any external modification)
5. File mode matches recorded value (detects `chmod`)

If any condition fails, the file is written normally.
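The skip check and the stream-advance step can be sketched with the standard library alone. This is a minimal illustration, not the code from the commit: `FileRecord`, `record_for`, `can_skip`, and `restore_entry` are hypothetical names, the tar entry is modeled as any `Read`, and the metadata accessors are Unix-only.

```rust
use std::fs;
use std::io::{self, Read};
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Hypothetical manifest entry holding the three recorded properties.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileRecord {
    pub size: u64,
    pub mtime_nanos: i128,
    pub mode: u32,
}

/// Stat a file and capture (size, mtime_nanos, mode).
pub fn record_for(path: &Path) -> io::Result<FileRecord> {
    let meta = fs::metadata(path)?;
    Ok(FileRecord {
        size: meta.size(),
        mtime_nanos: i128::from(meta.mtime()) * 1_000_000_000 + i128::from(meta.mtime_nsec()),
        mode: meta.mode(),
    })
}

/// True only when a record exists, the file exists, and size, mtime,
/// and mode all match the recorded values.
pub fn can_skip(path: &Path, recorded: Option<&FileRecord>) -> bool {
    match (recorded, record_for(path)) {
        (Some(expected), Ok(current)) => current == *expected,
        // No manifest entry, or stat failed (for example, file deleted).
        _ => false,
    }
}

/// Restore one entry. Returns Ok(true) if the file was written and
/// Ok(false) if the write was skipped.
pub fn restore_entry<R: Read>(
    entry: &mut R,
    dest: &Path,
    recorded: Option<&FileRecord>,
) -> io::Result<bool> {
    if can_skip(dest, recorded) {
        // Discard the bytes so the stream is positioned at the next entry.
        io::copy(entry, &mut io::sink())?;
        return Ok(false);
    }
    let mut out = fs::File::create(dest)?;
    io::copy(entry, &mut out)?;
    Ok(true)
}
```

Draining the entry into `io::sink()` still reads and decompresses the bytes. What the skip path avoids is the file creation and the disk write.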
## Edge cases

| Scenario | Behavior |
|---|---|
| First restore (no manifest) | Full extraction, manifest written async after |
| File manually edited | mtime changes → condition 4 fails → rewritten |
| File deleted | stat fails → condition 2 fails → rewritten |
| `chmod` on file | mode changes → condition 5 fails → rewritten |
| Different hash | Different manifest filename → no manifest → full extraction |
| Cache cleared | Manifests deleted alongside archives |
| Corrupt manifest | JSON parse fails → `None` → full extraction |
| Killed mid-restore | Manifest never written (async write hasn't started) → next run does full extraction |
| Remote cache download | No manifest exists locally → full extraction, manifest written after |
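Several of these rows reduce to one rule: if the manifest cannot be loaded for any reason, treat it as absent and extract everything. The sketch below illustrates that fallback under stated assumptions. To stay dependency-free it parses a simplified tab-separated stand-in format instead of the real JSON manifest, and `manifest_path`, `parse_manifest`, and `load_manifest` are hypothetical names rather than turbo's API.

```rust
use std::collections::HashMap;
use std::fs;
use std::path::{Path, PathBuf};

/// (size, mtime_nanos, mode) for one restored file.
pub type Entry = (u64, i128, u32);

/// Manifest location, following the `{hash}-manifest.json` naming in the
/// commit message. A different hash yields a different filename.
pub fn manifest_path(cache_dir: &Path, hash: &str) -> PathBuf {
    cache_dir.join(format!("{hash}-manifest.json"))
}

/// Parse the stand-in format: one `path\tsize\tmtime_nanos\tmode` per line.
/// Any malformed line makes the whole manifest unusable (returns None).
pub fn parse_manifest(text: &str) -> Option<HashMap<String, Entry>> {
    let mut files = HashMap::new();
    for line in text.lines() {
        let mut fields = line.split('\t');
        let path = fields.next()?;
        let size: u64 = fields.next()?.parse().ok()?;
        let mtime_nanos: i128 = fields.next()?.parse().ok()?;
        let mode: u32 = fields.next()?.parse().ok()?;
        if fields.next().is_some() {
            return None; // trailing garbage
        }
        files.insert(path.to_string(), (size, mtime_nanos, mode));
    }
    Some(files)
}

/// Missing file, unreadable file, and corrupt contents all collapse to
/// None, which the caller treats as "no manifest, do a full extraction".
pub fn load_manifest(cache_dir: &Path, hash: &str) -> Option<HashMap<String, Entry>> {
    let text = fs::read_to_string(manifest_path(cache_dir, hash)).ok()?;
    parse_manifest(&text)
}
```

Returning `Option` rather than `Result` keeps the restore path simple: the caller never has to distinguish why a manifest is unusable, only whether one is available.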