Paper: https://ieeexplore.ieee.org/document/7967181/references#references
The minimum queue’s requirements are:
enqueue
dequeue
Goal: Maximize speed and limit synchronization overheads.
N
.r1
or at cell index by r1 mod N
.r2
where r2 < r1
and r2 mod N = r1 mod N
.cell
s. An empty cell has no data, its rank and gap set to sentinel.tail
and head
to 0
.cell
at tail
.cell
’s rank is not a sentinel, that means the cell
is already occupied:
cell
’s gap to tail
, which is the current rank.tail
by 1, essentially skipping the current rank.cell
is not occupied i.e its rank is a sentinel:
cell
’s data.cell
’s rank to tail
, which is the current rank.tail
by 1.head
to get the rank
of the next item.cell
corresponding to rank
.cell
’s rank equals rank
, that means the cell indeed stores the item with rank head
:
cell
’s gap >= rank
and the cell
’s rank != rank
, that means this cell was skipped:
head
and update rank
to the next rank.cell
to the next cell.C1
while consumer 2 tries to access cell of rank C1 + N
which maps to the same cell. At most one consumer may modify the cell at a time.-> Without memory fences, this algorithm would cause undefined behavior with the C++ memory model. The implementation uses various compiler extensions anyways to this argument doesn’t hold.
enqueue
is achieved by allowing the producer to skip a cell under the assumption that eventually the producer would find an emtpy cell. If this assumption is violated, the producer would spin for a long time and enqueue
is no longer wait-free.dequeue
is not wait-free as if the producer suspends, all the consumers must wait and make no progress, however, consumers do not block each other.cell
’s rank != rank
in step 3.2 in dequeue
is to avoid a slow consumer being idle during the time gap between step 3.1 and 3.2 and a producer in the mean time has enqueued an item ranked rank
, did a full circle, skipped some cells and ended up updating the current cell
’s gap >= rank
.tail
to obtain the current rank
.cell
corresponding to the current rank
.cell
’s gap into g
.g >= rank
then this cell’s has been taken, break out of the loop.cell
’s rank into r
.r >= 0
then this cell has been used, skip it by performing an atomic double-compare-and-swap to:
cell
’s gap to rank
.cell
’s rank to r
.If both:
cell
’s gap == g
.cell
’s rank == r
.Reloop.
cell
’s gap to g
.cell
’s rank to another sentinel with reserve semantic.If both:
cell
’s gap == g
.cell
’s rank == sentinel.If this operation fails, reloop.
Otherwise:
cell
’s data.cell
’s rank to rank
.pthread
.Remark: We have to resort to platform-specific & compiler extensions. Can we relying on the portable C++ memory model?
-> Simple benchmark.
Benchmarks are written in C and compiled using gcc with optimization -O3
.
enqueue
and dequeue
operations on a MPMC single queue. Between two operations, an arbitrary delay is added to avoid a cache line being held by one thread for a long time.