llvm-project
10ade366 - [libc] Rework slab cache data structure for GPU allocator

Commit
1 day ago
[libc] Rework slab cache data structure for GPU allocator

Summary:
The slab cache was previously a Treiber stack, which is a perfectly fine generic, lock-free data structure. However, it relied on some expensive CAS operations and had issues with ABA. Because the only user of it was the slab cache mechanism, we can pretty safely specialize it. Instead, we simply search a fixed-size buffer for sentinel values and CAS into it. For allocations that only ever hit the cache, this improves performance from ~9000 cycles to ~6000 cycles, with similar improvements for workloads that feel the pain of small thread counts hitting the cache.
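
A minimal sketch of the "search a fixed-size buffer for a sentinel and CAS into it" idea, written as host-side C++ for illustration only. It is not the code from this commit: `SlabCache`, `CACHE_SIZE`, `EMPTY`, `push`, and `pop` are hypothetical names, the capacity of 8 is an assumption, and slab handles are modeled as plain integers.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names and sizes; the real allocator's types will differ.
constexpr std::size_t CACHE_SIZE = 8;  // assumed fixed capacity
constexpr std::uintptr_t EMPTY = 0;    // sentinel marking a free slot

struct SlabCache {
  // Every slot starts out holding the EMPTY sentinel.
  std::atomic<std::uintptr_t> slots[CACHE_SIZE] = {};

  // Try to stash a slab handle. Returns false if no free slot was found.
  bool push(std::uintptr_t slab) {
    assert(slab != EMPTY && "the sentinel itself cannot be cached");
    for (auto &slot : slots) {
      std::uintptr_t expected = EMPTY;
      // Claim the slot only if it still holds the sentinel.
      if (slot.compare_exchange_strong(expected, slab,
                                       std::memory_order_release,
                                       std::memory_order_relaxed))
        return true;
    }
    return false;
  }

  // Try to take any cached slab. Returns EMPTY if the scan found none.
  std::uintptr_t pop() {
    for (auto &slot : slots) {
      std::uintptr_t current = slot.load(std::memory_order_relaxed);
      // Swap the observed handle back to the sentinel; if another thread
      // took it first, the CAS fails and the scan moves to the next slot.
      if (current != EMPTY &&
          slot.compare_exchange_strong(current, EMPTY,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed))
        return current;
    }
    return EMPTY;
  }
};
```

In this sketch each operation is a linear scan with at most one successful CAS, and a slot only ever moves between the sentinel and a handle. There is no linked `next` pointer read ahead of the CAS, which is where the ABA hazard of a Treiber stack comes from. A caller that gets `false` from `push` or `EMPTY` from `pop` would presumably fall back to the allocator's slower path.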