uv
5621c414 - Use symlinks for directories entries in cache (#1037)

Commit
1 year ago
## Summary

One problem we have in the cache today is that we can't overwrite entries atomically, because we store unzipped _directories_ in the cache (which makes installation _much_ faster than storing zipped directories). So, if you ignore the existing contents of the cache when writing, you might run into an error, because you might attempt to write a directory where a directory already exists.

This is especially annoying for cache refresh, because in order to refresh the cache, we have to purge it (i.e., delete a bunch of stuff), which is also highly unsafe if Puffin is running across multiple threads or multiple processes.

The solution I'm proposing here is that whenever we persist a _directory_ to the cache, we persist it to a special "archive" bucket. Then, within the other buckets, directory entries are actually symlinks into that "archive" bucket. With symlinks, we can atomically replace, which means we can easily overwrite cache entries without having to delete from the cache.

The main downside is that we'll now accumulate dangling entries in the "archive" bucket, and so we'll need to implement some form of garbage collection to ensure that we remove entries with no symlinks. Another downside is that cache reads and writes will be a bit slower, since we need to deal with creating and resolving these symlinks.

As an example... after this change, the cache entry for this unzipped wheel is actually a symlink:

![Screenshot 2024-01-22 at 11 56 18 AM](https://github.com/astral-sh/puffin/assets/1309177/99ff6940-5096-4246-8d16-2a7bdcdd8d4b)

Then, within the archive directory, we actually have two unique entries (since I intentionally ran the command twice to ensure overwrites were safe):

![Screenshot 2024-01-22 at 11 56 22 AM](https://github.com/astral-sh/puffin/assets/1309177/717d04e2-25d9-4225-b190-bad1441868c6)
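
The archive-plus-symlink scheme described above can be sketched in a few lines. This is only an illustration in Python, not the Rust code from this commit: the function names, the `archive` directory name, the UUID-based entry names, and the garbage-collection pass are all assumptions made for the example. It also assumes a POSIX filesystem, where renaming a temporary symlink over an existing one is atomic, and that the staging directory lives on the same filesystem as the cache.

```python
import os
import uuid
from pathlib import Path


def persist_to_archive(cache_root: Path, unpacked: Path) -> Path:
    """Move an unpacked directory into the (hypothetical) archive bucket.

    Each entry gets a fresh unique name, so a write never collides with
    an existing directory.
    """
    archive = cache_root / "archive"
    archive.mkdir(parents=True, exist_ok=True)
    target = archive / uuid.uuid4().hex
    os.rename(unpacked, target)  # assumes same filesystem as the cache
    return target


def replace_symlink(target: Path, link: Path) -> None:
    """Point `link` at `target`, overwriting any existing symlink atomically."""
    link.parent.mkdir(parents=True, exist_ok=True)
    tmp = link.parent / f".tmp-{uuid.uuid4().hex}"
    os.symlink(target, tmp, target_is_directory=True)
    # rename(2) replaces the old symlink itself in one step on POSIX, so
    # readers see either the old target or the new one, never a gap.
    os.replace(tmp, link)


def dangling_archive_entries(cache_root: Path) -> set[Path]:
    """Return archive entries that no symlink elsewhere in the cache points to."""
    archive = (cache_root / "archive").resolve()
    live: set[Path] = set()
    for dirpath, dirnames, filenames in os.walk(cache_root):
        if Path(dirpath).resolve() == archive:
            dirnames.clear()  # do not descend into archived contents
            continue
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                live.add(path.resolve())
    return {entry for entry in archive.iterdir() if entry not in live}
```

Writing the same cache entry twice with `replace_symlink(persist_to_archive(root, staging), entry)` leaves two directories in the archive and one symlink pointing at the newer one, which matches the two archive entries mentioned above. The older directory is what a garbage-collection pass such as `dangling_archive_entries` would report.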