perf: Optimize engine builder, task visitor, and untracked file discovery (#11956)
## Summary
Three targeted optimizations to the `turbo run` hot path, focused on
areas identified via `--profile` on repos of varying sizes.
## Benchmarks
Hyperfine (30 runs, 10 warmup) comparing mainline to this branch across
three internal monorepos:
### Large repo (~1000 packages)
| | Mean | Range |
|---|---|---|
| **mainline** | 1.514s ± 0.115s | 1.370s – 1.801s |
| **this PR** | 1.265s ± 0.176s | 1.091s – 1.756s |
| | **1.20–1.26× faster** | |
`build_engine` self-time: 283ms → 74ms (−74%). `queue_task` self-time:
330ms → 181ms (−45%).
### Medium repo (~120 packages)
| | Mean | Range |
|---|---|---|
| **mainline** | 807ms ± 82ms | 744ms – 1149ms |
| **this PR** | 823ms ± 71ms | 765ms – 1055ms |
| | **~1.02× (within noise)** | |
### Small repo (~5 packages)
| | Mean | Range |
|---|---|---|
| **mainline** | 581ms ± 40ms | 523ms – 714ms |
| **this PR** | 579ms ± 50ms | 512ms – 708ms |
| | **~1.00× (within noise)** | |
The optimizations scale with `packages × tasks`, so the large repo sees
the most benefit. Medium and small repos are dominated by filesystem I/O
(`find_untracked_files`) and lockfile parsing, which are unaffected.
## Changes
- **Engine builder**: Cache the `turbo.json` `extends` chain per package
name and move the `visited` set check before `task_definition()`. Chain
resolution depends only on the package, not the task, so all tasks in
the same package share the cached chain. The early `visited` check
avoids recomputing task definitions for duplicate BFS entries.
- **Task visitor**: Defer the `env()` call to the non-dry-run branch.
The execution environment map is unused during dry runs, so this skips
per-task RwLock acquisition, `DetailedMap` cloning, and wildcard regex
matching in that path.
- **`find_untracked_files`**: Replace the shared `Mutex<Vec>` with
per-thread local buffers that flush through an `mpsc` channel on drop.
This eliminates per-file mutex contention in the parallel directory walker.
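
The per-thread buffer pattern from the last bullet can be sketched roughly as follows. This is a minimal illustration, not the code in `turborepo-scm`: `LocalBuffer`, `collect_untracked`, and the synthetic paths are hypothetical names, and plain `std::thread` workers stand in for the real parallel directory walker.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread;

/// Hypothetical per-thread buffer: collects paths locally and sends them
/// as a single batch when it is dropped.
struct LocalBuffer {
    paths: Vec<String>,
    tx: Sender<Vec<String>>,
}

impl LocalBuffer {
    fn new(tx: Sender<Vec<String>>) -> Self {
        Self { paths: Vec::new(), tx }
    }

    /// No lock is taken here; the buffer is owned by one thread.
    fn push(&mut self, path: String) {
        self.paths.push(path);
    }
}

impl Drop for LocalBuffer {
    fn drop(&mut self) {
        if !self.paths.is_empty() {
            // A send error only means the receiver is gone; ignore it.
            let _ = self.tx.send(std::mem::take(&mut self.paths));
        }
    }
}

/// Simulates a parallel walk in which each worker "discovers"
/// `files_per_worker` paths (an assumption made for this sketch).
fn collect_untracked(workers: usize, files_per_worker: usize) -> Vec<String> {
    let (tx, rx) = mpsc::channel();
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            let tx = tx.clone();
            thread::spawn(move || {
                let mut buf = LocalBuffer::new(tx);
                for f in 0..files_per_worker {
                    buf.push(format!("pkg-{w}/file-{f}.txt"));
                }
                // `buf` is dropped here, which flushes the whole batch.
            })
        })
        .collect();
    // Drop the original sender so the receiver ends once all workers finish.
    drop(tx);
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    let mut all: Vec<String> = rx.into_iter().flatten().collect();
    all.sort();
    all
}
```

Each worker performs one channel send per batch instead of one mutex acquisition per file, so synchronization cost no longer grows with the number of files found.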
## Testing
All existing tests pass across the three modified crates:
- `turborepo-engine`: 62 tests (covers extends chains, cycles, diamond
inheritance, `extends: false`, BFS graph construction)
- `turborepo-scm`: 110 tests (covers untracked file detection, git index
equivalence, package boundary isolation)