System image compression with zstd (#59227)
Revived version of #48244, with a slightly different approach. This
version looks for a function pointer called `jl_image_unpack` inside
compiled system images and invokes it to get the `jl_image_buf_t`
struct. Two implementations, `jl_image_unpack_zstd` and
`jl_image_unpack_uncomp`, are provided so the two can be compared. The
zstd compression is applied only to the heap image and not to the
compiled code, since the compiled code can be shared across Julia
processes.
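The dispatch described above can be illustrated with a small model. This is only a sketch, not the runtime's C code: `ImageBuf`, `SystemImage`, `build_image`, and `load_image` are hypothetical names, the fields of `ImageBuf` do not mirror the real `jl_image_buf_t`, and `zlib` stands in for zstd so that the example needs nothing outside the Python standard library.

```python
import zlib
from dataclasses import dataclass
from typing import Callable


@dataclass
class ImageBuf:
    """Hypothetical stand-in for jl_image_buf_t (real fields differ)."""
    heap: bytes  # heap image: the only part that is ever compressed
    code: bytes  # compiled code: stored as-is in both variants


def unpack_uncomp(stored_heap: bytes, code: bytes) -> ImageBuf:
    # Uncompressed variant: the stored bytes already are the heap image.
    return ImageBuf(heap=stored_heap, code=code)


def unpack_compressed(stored_heap: bytes, code: bytes) -> ImageBuf:
    # Compressed variant: zlib is an assumption standing in for zstd.
    return ImageBuf(heap=zlib.decompress(stored_heap), code=code)


@dataclass
class SystemImage:
    """Hypothetical image that carries its own unpack entry point."""
    stored_heap: bytes
    code: bytes
    unpack: Callable[[bytes, bytes], ImageBuf]


def build_image(heap: bytes, code: bytes, compress: bool) -> SystemImage:
    # Only the heap is compressed; the code bytes are never touched.
    if compress:
        return SystemImage(zlib.compress(heap, 6), code, unpack_compressed)
    return SystemImage(heap, code, unpack_uncomp)


def load_image(image: SystemImage) -> ImageBuf:
    # The loader does not know which variant it has; it calls the
    # function the image provides and uses whatever buffer comes back.
    return image.unpack(image.stored_heap, image.code)
```

In this model, `load_image(build_image(heap, code, compress=True))` and the same call with `compress=False` yield equal buffers; only the stored size of the heap differs.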
TODO: test a few different compression settings and enable by default.
Example data from un-trimmed juliac "hello world":
```
156M hello-uncomp
43M hello-zstd
48M hello-zstd-1
45M hello-zstd-5
43M hello-zstd-15
39M hello-zstd-22

$ hyperfine -w3 ./hello-uncomp
Benchmark 1: ./hello-uncomp
  Time (mean ± σ): 74.4 ms ± 0.8 ms [User: 51.9 ms, System: 19.0 ms]
  Range (min … max): 73.0 ms … 76.6 ms 39 runs

$ hyperfine -w3 ./hello-zstd-1
Benchmark 1: ./hello-zstd-1
  Time (mean ± σ): 152.4 ms ± 0.5 ms [User: 138.2 ms, System: 12.0 ms]
  Range (min … max): 151.4 ms … 153.2 ms 19 runs

$ hyperfine -w3 ./hello-zstd-5
Benchmark 1: ./hello-zstd-5
  Time (mean ± σ): 154.3 ms ± 0.5 ms [User: 139.6 ms, System: 12.4 ms]
  Range (min … max): 153.5 ms … 155.2 ms 19 runs

$ hyperfine -w3 ./hello-zstd-15
Benchmark 1: ./hello-zstd-15
  Time (mean ± σ): 135.9 ms ± 0.5 ms [User: 121.6 ms, System: 12.0 ms]
  Range (min … max): 135.1 ms … 136.5 ms 21 runs

$ hyperfine -w3 ./hello-zstd-22
Benchmark 1: ./hello-zstd-22
  Time (mean ± σ): 149.0 ms ± 0.6 ms [User: 134.7 ms, System: 12.1 ms]
  Range (min … max): 147.7 ms … 150.4 ms 19 runs
```
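To make the size/startup trade-off in the data above easier to read, the following sketch derives a size ratio and the added mean startup time for each level relative to the uncompressed binary. The numbers are copied from the listing (sizes in the `M` units shown, mean times in ms); `summarize` is a hypothetical helper, and the plain `hello-zstd` binary is left out because no timing was reported for it.

```python
# Figures copied from the benchmark listing above.
sizes_m = {"uncomp": 156, "zstd-1": 48, "zstd-5": 45, "zstd-15": 43, "zstd-22": 39}
mean_ms = {"uncomp": 74.4, "zstd-1": 152.4, "zstd-5": 154.3, "zstd-15": 135.9, "zstd-22": 149.0}


def summarize(sizes, times, baseline="uncomp"):
    """Return {name: (size_ratio, extra_ms)} relative to the baseline."""
    result = {}
    for name in sizes:
        if name == baseline:
            continue
        size_ratio = sizes[baseline] / sizes[name]   # how many times smaller
        extra_ms = times[name] - times[baseline]     # added mean startup time
        result[name] = (size_ratio, extra_ms)
    return result


rows = summarize(sizes_m, mean_ms)
for name, (size_ratio, extra_ms) in rows.items():
    print(f"{name}: {size_ratio:.1f}x smaller, +{extra_ms:.1f} ms mean startup")
```

This is a single "hello world" measurement, so the derived figures should be read as a rough guide for choosing a default level rather than as a general result.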
---------
Co-authored-by: Gabriel Baraldi <baraldigabriel@gmail.com>