[Turbopack] Use a presized scratch buffer for task encoding (#88924)
### What
Pass a shared buffer to use as scratch space for encoding `TaskStorage` values.
Also reduce the size of the `CollectorEntryValue` enum from 32 to 24 bytes. Our use of a `SmallVec` was inefficient; by managing the inline storage ourselves we can store more data inline (22 bytes instead of 16).
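To illustrate the inline-storage idea, here is a minimal sketch. It assumes a hypothetical `CompactValue` enum, not the real `CollectorEntryValue`, whose variants are not shown here. The arithmetic it relies on is 22 payload bytes + 1 length byte + 1 discriminant byte = 24 bytes. Rust does not guarantee enum layout, so the actual size should be checked with `std::mem::size_of`.

```rust
/// Hypothetical inline capacity, taken from the 22 bytes mentioned above.
pub const INLINE_CAPACITY: usize = 22;

/// Hypothetical stand-in for a compact value enum (not the real `CollectorEntryValue`).
pub enum CompactValue {
    /// Small payloads are stored inline: one length byte plus a fixed 22-byte array.
    Inline { len: u8, data: [u8; INLINE_CAPACITY] },
    /// Larger payloads live on the heap in an exactly sized allocation.
    Heap(Box<[u8]>),
}

impl CompactValue {
    /// Copies `bytes` inline when they fit, otherwise into a boxed slice.
    pub fn from_slice(bytes: &[u8]) -> Self {
        if bytes.len() <= INLINE_CAPACITY {
            let mut data = [0u8; INLINE_CAPACITY];
            data[..bytes.len()].copy_from_slice(bytes);
            // The cast cannot truncate because the length is at most 22.
            CompactValue::Inline { len: bytes.len() as u8, data }
        } else {
            CompactValue::Heap(bytes.into())
        }
    }

    /// Returns the stored bytes regardless of which variant holds them.
    pub fn as_slice(&self) -> &[u8] {
        match self {
            CompactValue::Inline { len, data } => &data[..*len as usize],
            CompactValue::Heap(bytes) => bytes,
        }
    }
}
```

A general-purpose small vector has to keep a capacity field and support growth, which is overhead a write-once value like this does not need.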
### Why
Currently, whenever we encode `TaskStorage` we allocate a new `TurboBincodeBuffer` (aka `SmallVec<[u8; 16]>`). Only the very smallest `TaskStorage` values fit in that space, so we end up allocating and resizing a buffer for almost every `TaskStorage` we encode. With a shared scratch buffer we avoid allocations and resizes during encoding, but now always need to copy our data out of the shared buffer.
This should reduce temporary allocations and the buffer copies caused by resizes in the common case, and it ensures that the buffers we pass through to the collectors are exactly sized.
I presized it at 4096 bytes since this covers ~98% of the tasks we encode, and it is a nice magic number.
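The scratch-buffer pattern can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `encode_into` is a made-up stand-in for the real `TaskStorage` encoder, and `encode_all` and `SCRATCH_CAPACITY` are hypothetical names.

```rust
/// Hypothetical constant mirroring the 4096-byte presize described above.
const SCRATCH_CAPACITY: usize = 4096;

/// Stand-in encoder (an assumption, not the real encoding): writes a
/// little-endian u32 length prefix followed by the payload bytes.
fn encode_into(value: &[u8], out: &mut Vec<u8>) {
    out.extend_from_slice(&(value.len() as u32).to_le_bytes());
    out.extend_from_slice(value);
}

/// Encodes every value through one shared scratch buffer and returns
/// exactly sized copies of the encoded bytes.
fn encode_all(values: &[&[u8]]) -> Vec<Box<[u8]>> {
    // Allocate the scratch space once, up front.
    let mut scratch: Vec<u8> = Vec::with_capacity(SCRATCH_CAPACITY);
    let mut encoded = Vec::with_capacity(values.len());
    for value in values {
        // `clear` resets the length but keeps the allocated capacity.
        scratch.clear();
        encode_into(value, &mut scratch);
        // One copy out of the scratch buffer into an exactly sized allocation.
        encoded.push(Box::<[u8]>::from(scratch.as_slice()));
    }
    encoded
}
```

Values larger than the scratch capacity still work in this sketch: the `Vec` grows once and keeps the larger capacity for later iterations. The trade-off is the one named above: one copy per value in exchange for no resizes in the common case.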
A future optimization could be to accumulate all writes for every SnapshotShard in a single buffer that we pass down to the `Collector`.