Turbopack: Rework watcher data structures, store watched list and watcher behind the same RwLock (#82258)
Reworks the data structures used for the watcher to achieve a few goals:
- I want to avoid multiple locks, as I don't think we actually need two different locks, and multiple locks leaves more room for bugs (e.g. lock inversion).
- The information about if we're in non-recursive mode or recursive mode needs to be available outside of the lock for performance reasons.
- The `watching`/`watched` set needs to be ordered. We don't have a convenient concurrent BTreeSet or Trie/Radix Tree pulled in, and I'd like to avoid adding more dependencies. We end up locking `notify_watcher` anywhere we'd need a write lock anyways, so it might block a few reads, but otherwise it's not that bad.
Fixes in this PR:
- Fixes an issue where multiple concurrent calls to `ensure_watching` for the same directory could cause one of those calls to exit early. This may cause a file in that directory to be read before the filesystem watcher is set up, potentially missing events.
- Improves the time complexity of `restore_if_watching` using an ordered set to avoid iterating over all of `watched` to find child directories. See `OrderedPathSetExt`. This time complexity problem is introduced in the previous PR, #82130.
Other changes:
- It turns out that we don't need to explicitly set up any watchers during `start_watching` except for the `root_dir`. If paths have already been read, they'll set up watchers before re-reading the file when we invalidate it! This simplifies things, and means there's less stuff we need to track if you never start the watcher (i.e. `next build`).
- Added a `--start-watching-late` option to the fuzzer to test calling `start_watching` after all the initial reads happen, to help validate the safety of the claim in the previous bullet point.
Minor (aesthetic) changes:
- Use past tense: `watching` is now `watched` and `ensure_watching` is now `ensure_watched`, as the `watched` list includes both currently active paths *and* previously watched but deleted paths.
- Shorten the names of non-public types, because we have a lot more of them, and the names would get long otherwise
## Benchmarking
This replaces a `DashSet<_>` with (essentially) a `RwLock<BTreeSet<_>>`. There's some risk of increased contention, particularly in the read-only fast-path for `ensure_watched`/`ensure_watching`.
Under the assumption that the most heavily lock-contended use-case for this code is after restoring from persistent cache on a large repository:
1. Check out `vercel-site` in `front`.
2. Hack the new tracing span onto the canary version.
3. Build a release version with `pnpm pack-next --project ~/front/apps/vercel-site --no-js-build -- --release`
4. `rm -rf .next`
5. `NEXT_TURBOPACK_TRACING='turbo_tasks_fs' TURBO_ENGINE_IGNORE_DIRTY=1 pnpm next dev --turbopack`
6. Load `http://localhost:3000/vercel`. Wait for it to fully finish loading. Stop the server with `ctrl+c`.
7. Restart the server, wait for it to finish initializing, and then exit.
8. Start the trace viewer: `pnpm next internal trace .next/trace-turbopack`, and look at the aggregated times of `ensure_watched`/`ensure_watching`.
Results:
before (canary): 313ms ensure_watching (2235x spans)
after (this pr): 279ms ensure_watched (2235x spans)
So no significant difference.