perf: Skip redundant file writes during cache restore using manifests (#12209)
## Summary
On repeat cache-hit runs, skips writing files that are already correct
on disk. A manifest records the size, mtime, and file mode of each file
after restoration. On subsequent restores of the same hash, files
matching all three properties are skipped.
## Why
Cache restoration writes ~1.24GB to disk per run (5 Next.js apps × 248MB
decompressed each). On repeat runs where outputs haven't changed, every
byte is identical to what's already on disk. These redundant writes
dominated wall-clock time at ~880ms.
## Benchmark (110-package monorepo, local macOS, repeat cache-hit)
| Metric | Before | After |
|---|---|---|
| Wall clock | 1.5-1.6s | **453ms** |
| `cache_reader_restore` CPU | 4.4s (280%) | 245ms (54%) |
| `visit_recv_wait` | 931ms | 47ms |
## How it works
**On restore:** After writing each file, `stat()` it and record `(size,
mtime_nanos, mode)` in a manifest at
`.turbo/cache/{hash}-manifest.json`. The manifest is written
asynchronously on a background thread so cache misses pay zero overhead.
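
A minimal sketch of the background manifest write, using only the standard library. The function name `write_manifest_in_background` and the choice to pass pre-serialized bytes are assumptions for illustration, not the code from this PR; JSON serialization is left out of scope.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::thread::{self, JoinHandle};

/// Hypothetical helper: persist already-serialized manifest bytes on a
/// background thread so the restore path does not wait on the write.
pub fn write_manifest_in_background(
    manifest_path: PathBuf,
    serialized: Vec<u8>,
) -> JoinHandle<io::Result<()>> {
    thread::spawn(move || {
        // Make sure the cache directory exists before writing into it.
        if let Some(parent) = manifest_path.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(&manifest_path, &serialized)
    })
}
```

The caller can drop the returned handle to let the write finish on its own, or `join()` it when it needs to observe the result.
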
**On subsequent restore of the same hash:** Before writing each file,
check the manifest. If the file exists on disk with matching size,
mtime, and mode → skip the write, advance the tar stream with
`io::copy(entry, io::sink())`. Otherwise, write normally.
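
The skip-or-write step can be sketched over any `Read` source. `restore_entry` and its `skip` flag are hypothetical names; the real code operates on tar entries, which this sketch assumes behave like an ordinary reader.

```rust
use std::io::{self, Read, Write};

/// Consume one archive entry and return the number of bytes read from it.
/// When `skip` is true the bytes are discarded; otherwise they go to `dest`.
pub fn restore_entry<R: Read, W: Write>(
    entry: &mut R,
    dest: &mut W,
    skip: bool,
) -> io::Result<u64> {
    if skip {
        // Drain the entry so the stream is positioned at the next entry
        // without touching the disk.
        io::copy(entry, &mut io::sink())
    } else {
        io::copy(entry, dest)
    }
}
```

Either branch reads the entry to its end, so the surrounding archive iteration proceeds the same way whether or not the file was written.
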
## Skip conditions
A write is skipped only when ALL are true:
1. Manifest exists for this hash
2. File exists on disk
3. Size matches recorded value
4. mtime matches recorded value (detects external modification, as long as the modifying tool updates mtime)
5. File mode matches recorded value (detects `chmod`)
If any condition fails, the file is written normally.
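
The five conditions can be expressed as one predicate. This is a sketch under assumptions: `Recorded`, `snapshot`, and `should_skip` are hypothetical names, `None` stands in for "no manifest (or no entry) for this hash", and the non-Unix fallback for the mode is an invention of this example.

```rust
use std::fs;
use std::path::Path;
use std::time::UNIX_EPOCH;

/// Hypothetical per-file record: the three properties the manifest stores.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Recorded {
    pub size: u64,
    pub mtime_nanos: u128,
    pub mode: u32,
}

#[cfg(unix)]
fn mode_bits(meta: &fs::Metadata) -> u32 {
    use std::os::unix::fs::PermissionsExt;
    meta.permissions().mode()
}

#[cfg(not(unix))]
fn mode_bits(meta: &fs::Metadata) -> u32 {
    // Assumption: without Unix mode bits, track only the read-only flag.
    meta.permissions().readonly() as u32
}

/// Stat `path` and capture its current (size, mtime_nanos, mode).
/// Returns `None` if the file cannot be stat'ed.
pub fn snapshot(path: &Path) -> Option<Recorded> {
    let meta = fs::metadata(path).ok()?;
    let mtime_nanos = meta.modified().ok()?.duration_since(UNIX_EPOCH).ok()?.as_nanos();
    Some(Recorded { size: meta.len(), mtime_nanos, mode: mode_bits(&meta) })
}

/// True only when every skip condition holds; any failure means "write".
pub fn should_skip(recorded: Option<&Recorded>, path: &Path) -> bool {
    // Condition 1: a manifest entry exists for this hash.
    let Some(expected) = recorded else { return false };
    // Condition 2: the file exists on disk (stat succeeds).
    let Some(actual) = snapshot(path) else { return false };
    // Conditions 3-5: size, mtime, and mode all match.
    actual.size == expected.size
        && actual.mtime_nanos == expected.mtime_nanos
        && actual.mode == expected.mode
}
```

The same `snapshot` helper could be used after a write to produce the values that go into the manifest, which keeps the recorded and compared properties in lockstep.
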
## Edge cases
| Scenario | Behavior |
|---|---|
| First restore (no manifest) | Full extraction, manifest written async after |
| File manually edited | mtime changes → condition 4 fails → rewritten |
| File deleted | stat fails → condition 2 fails → rewritten |
| `chmod` on file | mode changes → condition 5 fails → rewritten |
| Different hash | Different manifest filename → no manifest → full extraction |
| Cache cleared | Manifests deleted alongside archives |
| Corrupt manifest | JSON parse fails → `None` → full extraction |
| Killed mid-restore | Manifest never written (async write hasn't started) → next run does full extraction |
| Remote cache download | No manifest exists locally → full extraction, manifest written after |