|
Thread System 0.3.1
High-performance C++20 thread pool with work stealing and DAG scheduling
|
Thread System ships several lock-free queues for cases where mutex-based synchronization is too expensive. This tutorial explains how to choose between the available types, how hazard pointers reclaim memory safely, and what performance characteristics to expect.
A "lock-free" queue makes progress without acquiring a mutex. Producers and consumers coordinate through atomic operations on shared state. Lock-free queues are not always faster than mutex queues — they shine under high contention with short critical sections, while a well-tuned mutex queue may beat them when contention is low.
Thread System provides:
adaptive_job_queue — Auto-switches between mutex and lock-free modes based on observed contention. Use this by default.lockfree_job_queue — Multi-producer / multi-consumer (MPMC) Michael-Scott queue with hazard pointer reclamation.work_stealing_deque — Single-producer / multi-consumer Chase-Lev deque used by the work-stealing scheduler.| Property | MPMC (lockfree_job_queue) | SPSC (ring buffer) |
|---|---|---|
| Producers | Many | Exactly one |
| Consumers | Many | Exactly one |
| Throughput | High under contention | Highest possible |
| Latency | Bounded | Lowest |
| Memory model | Hazard pointers | No reclamation needed |
| Bounded? | Optional | Always |
Rules of thumb:
Lock-free queues cannot free a node immediately after popping it because another thread might still be reading the same node. Hazard pointers solve this by letting each thread publish the addresses it is currently reading. A node is only freed once no thread holds a hazard pointer to it.
Thread System's hazard pointer subsystem is mostly transparent — the lockfree queue manages slots automatically. If you implement your own lock-free structure, the API looks like this:
Each thread holds a small fixed array of hazard slots (4 by default). Retired nodes accumulate on a per-thread retire list and are scanned periodically; nodes whose addresses do not appear in any thread's hazard array are freed in batch.
Common pitfalls:
Approximate behavior under typical x86_64 conditions (your numbers will vary):
Memory:
When in doubt, run the benchmarks/queue_benchmark target on your target hardware before committing to a specific queue type.