Re: Blocking queue race condition?



Andrew Tomazos wrote:
Hey all,

I'm trying to implement a high performance blocking queue backed by a
circular buffer on top of pthreads, semaphore.h and gcc atomic
builtins. The queue needs to handle multiple simulataneous readers
and writers from different threads.

I've isolated some sort of race condition, and I'm not sure if it's a
faulty assumption about the behavior of some of the atomic operations
and semaphores, or whether my design is fundamentally flawed.

I've extracted and simplified it to the below standalone example. I
would expect that this program never returns. It does however return
after a few hundred thousand iterations with corruption detected in
the queue.

In the below example (for exposition) it doesn't actually store
anything, it just sets to 1 a cell that would hold the actual data,
and 0 to represent an empty cell. There is a counting semaphore
(vacancies) representing the number of vacant cells, and another
counting semaphore (occupants) representing the number of occupied
cells.

Writers do the following:
(1) decrement vacancies
(2) atomically get next head position
(3) write to it
(4) increment occupants

Readers do the opposite:
(1) decrement occupants
(2) atomically get next tail position
(3) read from it
(4) increment vacancies

I would expect that given the above, precisely one thread can be
reading or writing any given cell at one time.

Any ideas about why it doesn't work or debugging strategies
appreciated. Code and output below...

I tested the code a bit and it seems that it works alright
if NUM_TREADS is one. So the race is probably between
several readers or several writers.

it also works if you use a mutex like this(in both take _and_ put):

pthread_mutex_lock(&q_mx);
tmp=tail++;
get( tmp %QUEUE_CAPACITY);
pthread_mutex_unlock(&q_mx);

but _not_ like this(which is equivalent to the atomic ops code):

pthread_mutex_lock(&q_mx);
tmp=tail++;
pthread_mutex_unlock(&q_mx);
get( tmp %QUEUE_CAPACITY);

try to run inside gdb
use 'break file.cpp:<lineno>'
to set a breakpoint at the corruption
'info threads' lists your threads
'tread n' switches between them
'backtrace' shows where you are
help gives help
'p <expr>' prints _any_ expression (even with side effects)

Good Luck
JK
.



Relevant Pages

  • Blocking queue race condition?
    ... The queue needs to handle multiple simulataneous readers ... and 0 to represent an empty cell. ... CountingSemaphore(unsigned int initial) ... void post() ...
    (comp.os.linux.development.system)
  • Blocking queue race condition?
    ... The queue needs to handle multiple simulataneous readers ... and 0 to represent an empty cell. ... CountingSemaphore(unsigned int initial) ... void post() ...
    (comp.os.linux.misc)
  • Blocking queue race condition?
    ... The queue needs to handle multiple simulataneous readers ... and 0 to represent an empty cell. ... CountingSemaphore(unsigned int initial) ... void post() ...
    (comp.os.linux.development.apps)
  • Re: Adding files to a print queue
    ... in each cell. ... >> Dim sh as Worksheet ... >> Regards, ... >>> to the queue. ...
    (microsoft.public.excel.programming)
  • Re: Novus, Booty, This Ones For You
    ... mediated by any Cell. ... propagate, by supplying a changedp predicate that always answers "yes", and a rule can return a second value of:propagate so that this slot of this instance always propagates. ... Does execution of this SETF "hang" synchronously while the rest of X ... There's that word "queue" again. ...
    (comp.lang.lisp)