next.js
e22988e5 - Turbopack: switch chunk/asset hashes from hex to base40 encoding (#91137)

10 days ago
### What?

Switch Turbopack's hash encoding for chunk and asset output filenames from hexadecimal (base16) to base40, using the alphabet `0-9 a-z _ - ~ .`. Version hashes (used for HMR update comparison, not filenames) use base64 instead.

### Why?

Base40 encodes the same number of bits in fewer characters than hex, producing shorter output filenames. All 40 characters are RFC 3986 unreserved (URL-safe) and safe on case-insensitive filesystems (macOS HFS+/APFS, Windows NTFS).

Hash truncation lengths are adjusted so that collision resistance is at least equivalent to before. Most hashes get shorter; asset filename hashes grow from 8 to 13 chars so that they match chunk filenames:

| Context | Before (hex) | After (base40) | Entropy (base40) |
|---|---|---|---|
| Content hash in chunk filenames | 16 chars | 13 chars | ~69 bits |
| Content hash in asset filenames | 8 chars | 13 chars | ~69 bits |
| Ident disambiguator hash | 8 chars | 7 chars | ~37 bits |
| Long-path prefix hash | 5 chars | 4 chars | ~21 bits |

### How?

**New encoding module** (`turbo-tasks-hash/src/base40.rs`):
- Defines the base40 alphabet and length constants (`BASE40_LEN_64 = 13`, `BASE40_LEN_128 = 25`)
- Implements a generic `encode_base40_fixed<N>` helper to avoid duplication
- Public API: `encode_base40(u64) -> String` and `encode_base40_128(u128) -> String`

**New base64 encoding** (`turbo-tasks-hash/src/base64.rs`):
- `encode_base64(u64) -> String`: 11-char base64 (no padding) for version hashes
- Version hashes don't appear in URLs or filenames, so base64 is safe and shorter

**New `HashAlgorithm` variants** (`turbo-tasks-hash/src/lib.rs`):
- `Xxh3Hash64Base40` and `Xxh3Hash128Base40` added alongside existing hex variants
- Existing hex variants kept for internal manifests and identifiers

**`ContentHashing` moved to `turbopack-core`**:
- Moved from `turbopack-browser` to `turbopack-core/src/chunk/mod.rs` so both `BrowserChunkingContext` and `NodeJsChunkingContext` can use it

**Separate chunk vs asset content hashing**:
- `BrowserChunkingContext`: `content_hashing` renamed to `chunk_content_hashing` (optional), new `asset_content_hashing: ContentHashing` field (non-optional, defaults to 13 chars)
- `NodeJsChunkingContext`: new `asset_content_hashing: ContentHashing` field (non-optional, defaults to 13 chars)
- Builder methods: `use_content_hashing()` renamed to `chunk_content_hashing()`, new `asset_content_hashing()`

**Version hashes switched to base64**:
- `turbopack-nodejs/src/ecmascript/node/version.rs`
- `turbopack-dev-server/src/html.rs`
- `turbopack-browser/src/ecmascript/version.rs`, `merged/version.rs`, `list/version.rs`

**Other callers updated** (15 files across turbopack and next-core):
- All chunk/asset content hashing switched from `Xxh3Hash128Hex` → `Xxh3Hash128Base40`
- `ContentHashing::Direct { length }` reduced from 16 → 13
- Asset path truncations use the full 13-char base40 hash (matching chunk filenames)

**Exception: `wasm_edge_var_name`** (`turbopack-wasm/src/lib.rs`):
- Kept as `Xxh3Hash128Hex` because the hash is used as part of a JavaScript variable name (`wasm_{hash}`), and the base40 characters `-`, `~`, `.` are not valid JS identifier characters.

**Scope: NOT changed**
- Webpack configuration
- Internal manifests (`routes_hashes_manifest`, `project_asset_hashes_manifest`)
- Internal identifiers (font naming, external module hashing, data URI sources, debug IDs)
- SRI hashes (SHA-based Base64, different purpose)

---------

Co-authored-by: Vercel <vercel[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
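
For illustration, here is a minimal sketch of fixed-width base40 encoding of a `u64`, together with the arithmetic behind the entropy figures in the table. It is not the code in `turbo-tasks-hash/src/base40.rs`. Only the names `encode_base40` and `BASE40_LEN_64` come from the description above; the ordering of the four punctuation characters in the alphabet, the zero-padding behaviour, and the helper `entropy_bits` are assumptions made for this example.

```rust
/// Base40 alphabet: 0-9, a-z, then `_ - ~ .`.
/// ASSUMPTION: the relative order of the four punctuation characters
/// is chosen for this sketch and may differ from the real module.
const ALPHABET: &[u8; 40] = b"0123456789abcdefghijklmnopqrstuvwxyz_-~.";

/// Digits needed to represent any u64 in base 40, since 40^12 < 2^64 <= 40^13.
const BASE40_LEN_64: usize = 13;

/// Hypothetical fixed-width encoder: always emits 13 characters,
/// most significant digit first, padded with the zero digit.
fn encode_base40(mut value: u64) -> String {
    let mut buf = [ALPHABET[0]; BASE40_LEN_64];
    // Fill from the least significant digit (last slot) backwards.
    for slot in buf.iter_mut().rev() {
        *slot = ALPHABET[(value % 40) as usize];
        value /= 40;
    }
    // Every byte comes from the ASCII alphabet, so this cannot fail.
    String::from_utf8(buf.to_vec()).expect("alphabet is ASCII")
}

/// Hypothetical helper: approximate bits of entropy carried by
/// `chars` base40 characters, i.e. chars * log2(40).
fn entropy_bits(chars: u32) -> f64 {
    f64::from(chars) * 40.0_f64.log2()
}

fn main() {
    println!("{}", encode_base40(u64::MAX));
    // Truncation lengths from the table: 13, 7 and 4 characters.
    for chars in [13, 7, 4] {
        println!("{chars} chars ~ {:.1} bits", entropy_bits(chars));
    }
}
```

Each base40 character carries log2(40) ≈ 5.32 bits versus 4 bits for a hex character, which is why 13 base40 characters cover slightly more than the 64 bits that 16 hex characters did.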