next.js
ea56922a - turbo-persistence: add CRC32 block checksums (#90754)

Commit
56 days ago
### What?

Add a 4-byte CRC32 checksum to every block in turbo-persistence SST files.

### Why?

Detect on-disk cache corruption early with a clear error message, rather than silently returning wrong data or hitting confusing LZ4 decompression failures.

### How?

**New on-disk block format:**

```
[4B uncompressed_length] [4B CRC32 checksum] [block data]
```

(Previously the checksum field did not exist; the header was just the 4B uncompressed_length.)

- The checksum is computed on the **uncompressed** block data using `crc32fast` (hardware-accelerated CRC32 on modern CPUs).
- On write, the checksum is computed before compression and stored in the block header.
- On read, the checksum is verified after decompression (or directly for uncompressed blocks). A mismatch returns an `anyhow` error with a message like: `Cache corruption detected: checksum mismatch in block N of SST file seq:M`.
- During compaction, medium values are copied as raw compressed blocks without decompression, so the checksum is carried through the `MediumRaw`/`LazyLookupValue::Medium` path unchanged.
- A `BLOCK_HEADER_SIZE` constant (= 8) replaces magic numbers across the write, read, and inspect code.
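The header layout and the read-side check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `encode_block` and `decode_block` are hypothetical names, the big-endian byte order of the header fields is an assumption, compression is left out entirely (the payload is stored as-is), and errors are plain `String`s rather than `anyhow` errors. To keep the snippet free of external crates, `checksum_block` here is a slow bitwise CRC-32 standing in for `crc32fast::hash()`.

```rust
/// Size of the block header: 4B uncompressed_length + 4B CRC32 checksum.
const BLOCK_HEADER_SIZE: usize = 8;

/// Bitwise CRC-32 (IEEE polynomial, reflected form). A stand-in for
/// `crc32fast::hash()` so that this sketch needs no dependencies.
fn checksum_block(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // mask is all ones when the lowest bit is set, else zero
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical writer: header followed by the (uncompressed) payload.
/// Big-endian field encoding is an assumption of this sketch.
fn encode_block(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(BLOCK_HEADER_SIZE + data.len());
    out.extend_from_slice(&(data.len() as u32).to_be_bytes());
    out.extend_from_slice(&checksum_block(data).to_be_bytes());
    out.extend_from_slice(data);
    out
}

/// Hypothetical reader: parses the header, then verifies the checksum
/// against the payload before handing the data back.
fn decode_block(block: &[u8], block_index: usize, seq: u32) -> Result<Vec<u8>, String> {
    if block.len() < BLOCK_HEADER_SIZE {
        return Err(format!("block {} is shorter than its header", block_index));
    }
    let length = u32::from_be_bytes([block[0], block[1], block[2], block[3]]) as usize;
    let stored = u32::from_be_bytes([block[4], block[5], block[6], block[7]]);
    let data = &block[BLOCK_HEADER_SIZE..];
    if data.len() != length {
        return Err(format!("block {} has an unexpected length", block_index));
    }
    if checksum_block(data) != stored {
        return Err(format!(
            "Cache corruption detected: checksum mismatch in block {} of SST file seq:{}",
            block_index, seq
        ));
    }
    Ok(data.to_vec())
}
```

Calling `decode_block(&encode_block(b"hello"), 0, 1)` returns the original bytes, while flipping any single byte of the header checksum or of the payload makes it return the corruption error instead.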
**Files changed:**

- `compression.rs`: `checksum_block()` helper wrapping `crc32fast::hash()`
- `static_sorted_file_builder.rs`: write path (`write_raw_block_to_file`, `write_block_to_file`, `close()` index block); `EntryValue::MediumRaw` carries checksum
- `static_sorted_file.rs`: read path (`get_raw_block_slice`, `read_block` with `verify_checksum()` helper)
- `lookup_entry.rs`: `LazyLookupValue::Medium` carries checksum for compaction
- `bin/sst_inspect.rs`: updated block parsing offsets
- `README.md`: documented new format

**Tests:**

- All 48 existing tests pass (checksums are transparent to correct data)
- 2 new corruption tests: one corrupts a compressed block's checksum, the other corrupts an uncompressed block's data; both verify that the error is returned
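The compaction carry-through mentioned for `EntryValue::MediumRaw` and `LazyLookupValue::Medium` can be pictured with a small sketch. Everything here is illustrative: `RawMediumBlock`, `split_raw_block`, and `write_raw_block` are hypothetical names, the field layout is assumed from the block format in the commit message, and big-endian header encoding is an assumption. The point is only that the stored checksum travels with the still-compressed bytes and is written back verbatim, with no decompression or recomputation.

```rust
/// Size of the block header: 4B uncompressed_length + 4B CRC32 checksum.
const BLOCK_HEADER_SIZE: usize = 8;

/// Hypothetical stand-in for a medium value picked up during compaction:
/// the header fields plus the block bytes exactly as they sit on disk.
struct RawMediumBlock {
    uncompressed_length: u32,
    checksum: u32,
    compressed: Vec<u8>,
}

/// Splits an on-disk block into header fields and raw payload without
/// touching (or decompressing) the payload itself.
fn split_raw_block(block: &[u8]) -> Option<RawMediumBlock> {
    if block.len() < BLOCK_HEADER_SIZE {
        return None;
    }
    Some(RawMediumBlock {
        uncompressed_length: u32::from_be_bytes([block[0], block[1], block[2], block[3]]),
        checksum: u32::from_be_bytes([block[4], block[5], block[6], block[7]]),
        compressed: block[BLOCK_HEADER_SIZE..].to_vec(),
    })
}

/// Writes the block into the new file, reusing the carried checksum
/// instead of recomputing it from decompressed data.
fn write_raw_block(out: &mut Vec<u8>, raw: &RawMediumBlock) {
    out.extend_from_slice(&raw.uncompressed_length.to_be_bytes());
    out.extend_from_slice(&raw.checksum.to_be_bytes());
    out.extend_from_slice(&raw.compressed);
}
```

Under these assumptions, splitting a block and writing it back yields byte-identical output, so a later read of the compacted file verifies against the same checksum that was computed when the value was first written.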