next.js
fff4a4d9 - turbo-persistence: stop background persisting after unrecoverable failure (#92106)

Commit
33 days ago
### What?

When a persist or compaction operation fails in `turbo-persistence`, the database now:

- Rolls back cleanly (deletes orphan files, restores CURRENT)
- Stops the background persisting process for the session
- Keeps in-memory state consistent with on-disk state at all times
- Deletes superseded files safely (with Windows fallback for open memory maps)

### Why?

Previously, a failed write operation (e.g. disk full, I/O error) would leave the database in a broken state:

1. **Misleading error loop**: The `active_write_operation` `AtomicBool` was left set to `true` after failure, so every subsequent snapshot cycle printed _"another write operation is already in progress"_ forever, hiding the real error.
2. **In-memory corruption**: `commit()` mutated `inner.meta_files` and `inner.current_sequence_number` *before* writing the CURRENT file to disk. If a disk error occurred between those two steps, the in-memory state was inconsistent with disk and the rollback had no way to fix it.
3. **Rollback could corrupt committed data**: If `commit()` failed *after* writing CURRENT (e.g. during old-file deletion or LOG writing), the rollback would delete the *newly committed* files, corrupting the database.
4. **Task graph corruption**: `save_snapshot` consumes task cache log entries. If it failed, those entries were lost, but the background loop would continue trying to persist, silently skipping those tasks and corrupting the task graph in storage.
5. **Partially written CURRENT**: If the failure happened mid-write to the CURRENT file, it could be left with partial/corrupt content, but nothing restored it.

### How?

**`WriteOperationGuard` RAII (db.rs)**

A new `WriteOperationGuard<'a>` replaces the `AtomicBool` + manual `try_recover_after_failed_write()` pattern.
The guard holds:

- `&'a Mutex<Option<ActiveWriteState>>`: the write slot (`None` = idle, `Some(Active("write batch"))` = in progress, `Some(Error)` = permanently disabled)
- `path: &'a Path`: database directory for rollback
- `seq_before: u32`: sequence number at operation start
- `succeeded: bool`: set by `guard.success()`

On `drop`, if not succeeded:

1. Writes `seq_before` back to CURRENT (repairs a partially-written CURRENT)
2. Deletes all files with `seq > seq_before` (orphans from the failed operation)
3. Sets the slot to `None` (if cleanup succeeded) or `Some(Error)` (if cleanup itself failed)

The `Active` variant carries a `&'static str` name (e.g. `"write batch"`, `"compaction"`) used in error messages.

**Three-phase `commit()` (db.rs)**

`commit()` is restructured so `inner` is completely unmodified before the point of no return:

| Phase | What happens | `inner` state | On failure |
|-------|-------------|---------------|------------|
| **A** | Compute `meta_seq_numbers_to_delete` via `sst_filter`. Uses `apply_filter_collect` (read-only) to update filter state and collect per-meta-file removal sets without modifying any MetaFile. Only a read lock on `inner` is needed. | Unchanged | Guard deletes orphan files + restores CURRENT; `inner` is intact |
| **B** | Write `.del` file and CURRENT to disk. | Unchanged | Same as above |
| **C** | Apply deferred `retain_entries` (from A's removal sets), append new metas, remove obsolete metas, bump `current_sequence_number`. Try to delete superseded files; defer failures. | Updated | CURRENT is already durable; commit is irreversible |

After CURRENT is written (point of no return), LOG writing errors are caught and reported via `eprintln!`. They must not propagate, because the `WriteOperationGuard` would then run its rollback and delete the *newly committed* files.
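The guard behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from `db.rs`: the constructor name `start`, the decimal-text encoding of CURRENT, and the rule that a file's numeric stem is its sequence number are all assumptions made for the example.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::sync::Mutex;

/// State of the write slot; `None` in the surrounding `Option` means idle.
#[derive(Debug, PartialEq)]
pub enum ActiveWriteState {
    /// A write operation is running; the name is used in error messages.
    Active(&'static str),
    /// Cleanup after a failed write itself failed; writes stay disabled.
    Error,
}

pub struct WriteOperationGuard<'a> {
    slot: &'a Mutex<Option<ActiveWriteState>>,
    path: &'a Path,
    seq_before: u32,
    succeeded: bool,
}

impl<'a> WriteOperationGuard<'a> {
    /// Claims the write slot (hypothetical constructor name).
    pub fn start(
        slot: &'a Mutex<Option<ActiveWriteState>>,
        path: &'a Path,
        seq_before: u32,
        name: &'static str,
    ) -> Result<Self, String> {
        let mut state = slot.lock().unwrap();
        if let Some(current) = state.as_ref() {
            return Err(match current {
                ActiveWriteState::Active(other) => format!("{other} is already in progress"),
                ActiveWriteState::Error => "writes are disabled for this session".to_string(),
            });
        }
        *state = Some(ActiveWriteState::Active(name));
        Ok(Self { slot, path, seq_before, succeeded: false })
    }

    /// Marks the operation as committed, so `drop` skips the rollback.
    pub fn success(mut self) {
        self.succeeded = true;
    }

    fn rollback(&self) -> io::Result<()> {
        // Step 1: rewrite CURRENT (assumed format: decimal text).
        fs::write(self.path.join("CURRENT"), self.seq_before.to_string())?;
        // Step 2: delete files whose numeric stem exceeds `seq_before`.
        for entry in fs::read_dir(self.path)? {
            let file = entry?.path();
            let seq = file
                .file_stem()
                .and_then(|s| s.to_str())
                .and_then(|s| s.parse::<u32>().ok());
            if matches!(seq, Some(s) if s > self.seq_before) {
                fs::remove_file(&file)?;
            }
        }
        Ok(())
    }
}

impl Drop for WriteOperationGuard<'_> {
    fn drop(&mut self) {
        // Step 3: free the slot, or disable writes if cleanup failed.
        // A poisoned mutex would panic here; a real implementation may differ.
        let next = if self.succeeded || self.rollback().is_ok() {
            None
        } else {
            Some(ActiveWriteState::Error)
        };
        *self.slot.lock().unwrap() = next;
    }
}
```

Because the rollback lives in `Drop`, every early return (including `?`) from a write operation cleans up without an explicit recovery call. This is also why errors after the point of no return must be swallowed rather than propagated: the caller has to reach `guard.success()` before the guard goes out of scope.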
**`SstFilter::apply_filter_collect` (sst_filter.rs)**

A new read-only variant of `apply_filter` that updates the filter state and returns a `FxHashSet<u32>` of SST entry sequence numbers to remove from each meta file, without calling `retain_entries`. The original `apply_filter` (which mutates the MetaFile) is still used by `load_directory` and during new-meta-file construction where immediate mutation is appropriate.

**Deferred file deletion (db.rs)**

Superseded `.sst`/`.meta`/`.blob` files are deleted immediately after Phase C (once `inner` is updated). On Linux/macOS this always succeeds, even if concurrent readers have the files memory-mapped. On Windows, open memory maps prevent deletion: any file that fails is stored as a `DeferredDeletion` enum (`Sst(u32)` / `Meta(u32)` / `Blob(u32)`) and retried on the next commit or at shutdown. The `.del` file written during Phase B ensures crash recovery via `load_directory` regardless.

**Background loop error handling (backend/mod.rs)**

- `snapshot_and_persist()` returns `Result<(Instant, bool), anyhow::Error>` instead of `Option`. When `save_snapshot` fails, the error propagates with `?`.
- The background loop matches on the `Result`: on `Err`, it logs the error and a message that persisting is disabled for this session, then returns (permanently stopping the background job).
- `has_unrecoverable_write_error()` checks the `ActiveWriteState::Error` variant to detect permanent failure after compaction errors.

<!-- NEXT_JS_LLM_PR -->

---------

Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>