next.js
91d78d10 - [turbopack] Create a simple benchmark to measure the overhead of turbotasks (#82982)

Commit
114 days ago
[turbopack] Create a simple benchmark to measure the overhead of turbotasks (#82982)

This benchmark works by

* creating a simple busy wait function to simulate CPU-bound work
  - This is mostly to prove that we are measuring overheads and not some feature of small tasks. The overheads for a 1us task are ~identical to the overheads for a 10us task
* running that function directly, as an uncached turbotask, as a cached turbotask hitting one key, and as a cached turbotask hitting many different cached keys

This allows us to independently estimate the error of the busy loop, the overhead of launching and executing a task, and the overhead of getting a cache hit.

I should note, these are _idealized_ conditions and we should expect overheads to be higher during an actual build. This is because there is no actual contention for CPU resources (no task queueing), the internal hashmaps are small so there is less time waiting to fill caches from RAM, and there are no parallel allocations/deallocations occurring (contending in mimalloc data structures).

On my machine I see:

* Cache-Hit (all same key): 130ns-140ns
* Cache-Hit (all different keys): 240ns-500ns
* Turbo-Task-Execution Overhead: 4-8us

The fact that the cost of a cache hit is so different depending on whether we are hitting the same key or different keys is interesting. I would assume that this is the cost of missing CPU caches (~100ns to fill a cache line from RAM); this implies we are paying for an extra 2-3 dependent RAM reads (which in retrospect sounds like a DashMap to me!)

This implies a 'break even' equation for caching. Given a task that takes `T` ns of time, caching becomes worth it after this many executions:

`ceil((TOverhead - TCache + TTask) / (TTask - TCache))`

Plugging in our values we get the following results.
In the optimistic case for caching

* T <= 130ns: never worth it
* T = 1000ns (1us): 7 executions (1 actual, 6 cache hits)
* T >= 10μs: 2 executions (aka need at least one cache hit)

In the pessimistic case for caching

* T <= 500ns: never worth it
* T = 1000ns (1us): 11 executions (1 actual, 10 cache hits)
* T >= 10μs: 2 executions

So really we should be suspicious of all tasks that never get a hit, and all tasks faster than 10us need many hits for caching to pay off.
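
The break-even formula above can be checked with a few lines of Rust. This is a minimal sketch, not code from the commit: the function name `break_even_executions` is hypothetical, and the 5000ns task overhead is an assumption. The commit only reports a 4-8us range; 5us is used here because it reproduces the 7 and 11 execution figures for a 1us task.

```rust
/// Break-even number of executions for caching, using the formula
/// ceil((overhead - cache_hit + task) / (task - cache_hit)), all in nanoseconds.
/// Returns None when a cache hit costs at least as much as running the task,
/// i.e. the "never worth it" case.
/// (Hypothetical helper for illustration; not part of turbo-tasks.)
fn break_even_executions(task_ns: u64, overhead_ns: u64, cache_hit_ns: u64) -> Option<u64> {
    if task_ns <= cache_hit_ns {
        return None;
    }
    // task_ns > cache_hit_ns here, so neither subtraction can underflow.
    let numerator = overhead_ns + task_ns - cache_hit_ns;
    let denominator = task_ns - cache_hit_ns;
    // Integer ceiling division.
    Some((numerator + denominator - 1) / denominator)
}

fn main() {
    // ASSUMPTION: 5us overhead, a point inside the reported 4-8us range.
    let overhead_ns = 5_000;
    // Cache-hit costs taken from the measurements: 130ns (same key), 500ns (different keys).
    for (label, cache_hit_ns) in [("optimistic", 130), ("pessimistic", 500)] {
        for task_ns in [cache_hit_ns, 1_000, 10_000] {
            match break_even_executions(task_ns, overhead_ns, cache_hit_ns) {
                Some(n) => println!("{label}: T = {task_ns}ns -> {n} executions"),
                None => println!("{label}: T = {task_ns}ns -> never worth it"),
            }
        }
    }
}
```

With a different overhead in the 4-8us range the 1us results shift (for example, 4000ns gives 6 executions in the optimistic case), so the exact thresholds should be read as machine-dependent estimates.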