Initial tracing implementation (#4966)
### Description
* adds a `raw_trace` subscriber that emits the raw trace events into
`.turbopack/trace.log` (or `.next/trace.log` for Next.js)
* adds a CLI script which converts the raw trace into a Chrome trace
JSON file that can be opened in https://ui.perfetto.dev/
* adds a `TURBOPACK_TRACING` (resp. `NEXT_TURBOPACK_TRACING`) env var to
enable tracing
* adds some presets, e.g. `turbopack` or `next`, to enable tracing for
certain things.
* adds tracing for invalidations
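
As a rough illustration of the conversion step, the sketch below builds a Chrome trace JSON document from a list of spans. The span tuples and the `to_chrome_trace` helper are hypothetical and do not reflect the actual raw trace format or the CLI script in this PR; only the output shape (a `traceEvents` array of `"ph": "X"` complete events with microsecond timestamps) follows the Chrome Trace Event Format.

```python
import json


def to_chrome_trace(spans):
    """Convert (name, start_us, duration_us, thread_id) tuples into Chrome trace events.

    The input tuple layout is an assumption made for this sketch.
    """
    events = []
    for name, start_us, duration_us, thread_id in spans:
        events.append({
            "name": name,
            "ph": "X",           # "complete" event: carries a start and a duration
            "ts": start_us,      # timestamps are expressed in microseconds
            "dur": duration_us,
            "pid": 1,
            "tid": thread_id,
        })
    return {"traceEvents": events}


# Hypothetical spans, not taken from a real trace.log.
spans = [("resolve", 0, 1500, 1), ("parse", 1500, 4000, 1)]
chrome_trace = to_chrome_trace(spans)
trace_json = json.dumps(chrome_trace)
```

Writing `trace_json` to a file should give something that can be loaded in a Chrome trace viewer such as the Perfetto UI.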
There are three different visualization modes:
#### `--single`
Shows all CPU time as it would appear if a single CPU executed the
whole workload.
(10 concurrent tasks that take 1s each are shown as 10 sequential tasks
of 1s, with a total time of 10s)
Pro:
* Concurrency is visualized by bar filling (dark filled bars indicate
too little concurrency)
* It injects pseudo bars with "cpus idle" for low concurrency (with
`--idle`)
Con:
* Total time won't be represented correctly, since a single CPU would
take longer
Use Case: Gives a good overview of slow tasks in a build.
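
The arithmetic behind this mode can be sketched as follows. This is an illustrative model only, assuming tasks are given as hypothetical `(name, duration)` pairs; `layout_single` is not a function from this PR.

```python
def layout_single(tasks):
    """Lay out (name, duration) tasks back-to-back, as if one CPU ran them all.

    Returns the placed bars as (name, start, end) and the total displayed time.
    """
    cursor = 0.0
    placed = []
    for name, duration in tasks:
        placed.append((name, cursor, cursor + duration))
        cursor += duration
    return placed, cursor


# 10 concurrent tasks of 1s each, as in the example above.
tasks = [(f"task{i}", 1.0) for i in range(10)]
placed, total = layout_single(tasks)
```

Here `total` is the sum of all task durations (10s), which is why the total time in this mode overstates the real wall-clock time.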
#### `--merged`
Shows all CPU time scaled down by the concurrency.
(10 concurrent tasks that take 1s each are shown as 10 tasks that take
0.1s each, with a total time of 1s)
Pro:
* Total time is represented correctly
* Tasks that run with low concurrency appear bigger
* Concurrency is visualized by bar filling (dark filled bars indicate
too little concurrency)
* It injects pseudo bars with "cpus idle" for low concurrency (with
`--idle`)
Con:
* Individual task times won't be represented correctly.
Use Case: Gives a good overview of why a build is slow overall.
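
One plausible way to model the scaling is to split the timeline at every task start and end, and share each slice equally among the tasks active in it. The sketch below assumes hypothetical `(name, start, end)` tuples with unique names; `merged_durations` is an illustration, not the algorithm used by the CLI script.

```python
def merged_durations(tasks):
    """Return {name: displayed duration} for (name, start, end) tasks.

    Each slice of wall-clock time is divided equally among the tasks
    that are active during that slice.
    """
    boundaries = sorted({t for _, start, end in tasks for t in (start, end)})
    shown = {name: 0.0 for name, _, _ in tasks}
    for left, right in zip(boundaries, boundaries[1:]):
        active = [name for name, start, end in tasks
                  if start <= left and end >= right]
        for name in active:
            shown[name] += (right - left) / len(active)
    return shown


# 10 fully concurrent tasks of 1s: each is displayed as roughly 0.1s.
concurrent = merged_durations([(f"task{i}", 0.0, 1.0) for i in range(10)])

# A task that runs alone for part of its lifetime is displayed bigger.
mixed = merged_durations([("a", 0.0, 2.0), ("b", 0.0, 1.0)])
```

In the second example, `a` is displayed as 1.5s and `b` as 0.5s: the displayed times add up to the 2s of wall-clock time, but neither matches the task's real duration.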
#### `--threads`
Shows CPU time distributed across an unlimited number of virtual
CPUs/threads.
(10 concurrent tasks that take 1s each are shown as 10 concurrent tasks
of 1s, with a total time of 1s)
Pro:
* Concurrency is shown via multiple CPUs
* Most realistic visualization
Con:
* Hard to read
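
A minimal sketch of how tasks could be spread over virtual threads: each task goes to the first lane that is free at its start time, and a new lane is opened when none is free. The `(name, start, end)` tuples and the `assign_lanes` helper are assumptions for illustration, not the implementation in this PR.

```python
def assign_lanes(tasks):
    """Return {name: lane index} for (name, start, end) tasks.

    Greedy assignment: reuse the first lane whose last task has finished,
    otherwise open a new virtual lane.
    """
    lane_free_at = []  # end time of the last task placed on each lane
    lanes = {}
    for name, start, end in sorted(tasks, key=lambda t: t[1]):
        for lane, free_at in enumerate(lane_free_at):
            if free_at <= start:
                lane_free_at[lane] = end
                lanes[name] = lane
                break
        else:
            # No existing lane is free, so open another virtual thread.
            lane_free_at.append(end)
            lanes[name] = len(lane_free_at) - 1
    return lanes


# 10 concurrent tasks of 1s each, followed by one later task.
tasks = [(f"task{i}", 0.0, 1.0) for i in range(10)] + [("late", 1.0, 2.0)]
lanes = assign_lanes(tasks)
```

The ten concurrent tasks each get their own lane, while `late` reuses lane 0. The number of lanes grows with peak concurrency, which is one reason this view can become hard to read.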