Re: Obscure mutex problem



On Sep 4, 4:43 pm, David Given <d...@xxxxxxxxxxx> wrote:

> The way the program works is as follows: there are multiple threads of
> execution, where normally only one is running at a time. This is controlled by
> a single mutex that's normally locked. All other threads block on this mutex.
> During I/O, or any other function that ought to take place in the background,
> the mutex is released, so allowing another thread to run. (The reason for the
> odd design is that it's a coroutine implementation.) The mutex is a perfectly
> ordinary standard mutex that's initialised from the main thread before any
> child threads are created. All the work happens in the child threads; the main
> thread just sleeps.

So each child thread looks like this:

1) Acquire the mutex.
2) Do work that we don't expect to block.
3) Release the mutex, do blocking I/O, re-acquire the mutex.
4) Go to step 2.

Is that correct?

One way this can break horribly is if things have changed by the time
you re-acquire the mutex. When you have no choice but to use a pattern
like this, you generally need to add a step 3.5:

3.5) Make sure nothing changed while the mutex was released. Re-validate
pointers, and if something we were working on has changed, abandon or
restart this operation.
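
As a rough illustration of step 3.5, here is a minimal sketch that uses a
generation counter to detect changes made while the mutex was released. All
the names (big_lock, struct job, job_step, io_quiet, io_interfering) are
hypothetical and not taken from the original program, and for brevity the
lock is taken and dropped inside one function instead of being held across
the whole thread body.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the program's single global mutex. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

struct job {
    unsigned generation; /* bumped under big_lock on every modification */
    int value;
};

/* Stand-in for a blocking call; runs while big_lock is released. */
typedef void (*io_fn)(struct job *);

/* Blocking call during which nobody touches the job. */
void io_quiet(struct job *j) { (void)j; }

/* Blocking call during which "another thread" modifies the job. */
void io_interfering(struct job *j)
{
    pthread_mutex_lock(&big_lock);
    j->value = 0;
    j->generation++;
    pthread_mutex_unlock(&big_lock);
}

/* Returns true if the step completed, false if the job changed while
 * the mutex was released and the caller should abandon or restart. */
bool job_step(struct job *j, io_fn blocking_io)
{
    pthread_mutex_lock(&big_lock);        /* step 1 */
    unsigned seen = j->generation;        /* step 2: snapshot state */
    int input = j->value;

    pthread_mutex_unlock(&big_lock);      /* step 3 */
    blocking_io(j);
    pthread_mutex_lock(&big_lock);

    bool ok = (j->generation == seen);    /* step 3.5: revalidate */
    if (ok) {
        j->value = input + 1;
        j->generation++;
    }
    pthread_mutex_unlock(&big_lock);
    return ok;
}
```

A generation counter is only one way to revalidate; re-looking-up an object
by key or reference counting it before the unlock are common alternatives.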

> - in studying the strace output, I notice that there doesn't seem to be a call
> to futex() corresponding to the initial pthread_mutex_lock() from the main thread.

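Calls to futex() are only needed to block or wake a task. They are not
needed to lock or unlock an uncontended mutex; that is done entirely in
user space with atomic operations, so nothing shows up in strace.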
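
To make the fast path concrete, here is a toy model of a futex-style lock
word using C11 atomics. It is a sketch only, not glibc's actual code: the
names are hypothetical, and the places where a real implementation would
call futex() are marked in comments instead of being implemented.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy lock word: 0 = unlocked, 1 = locked, 2 = locked with waiters. */
struct toy_mutex {
    atomic_int state;
};

/* Fast path: a single compare-and-swap in user space.
 * Returns true if the lock was taken with no kernel involvement.
 * On false, a real lock would mark the word contended and block
 * in the kernel with a futex wait. */
bool toy_trylock_fast(struct toy_mutex *m)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&m->state, &expected, 1);
}

/* What a waiter does just before it would block in the kernel. */
void toy_mark_contended(struct toy_mutex *m)
{
    atomic_exchange(&m->state, 2);
}

/* Unlocks the word. Returns true only if there were waiters, which is
 * the one case where a real lock would issue a futex wake. */
bool toy_unlock_needs_wake(struct toy_mutex *m)
{
    return atomic_exchange(&m->state, 0) == 2;
}
```

In this model a lock followed by an unlock with no other thread involved
never reaches either of the points where a system call would be made.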

> - when initially writing the code, I discovered that I had to set
> PTHREAD_MUTEX_RECURSIVE to make the mutexes work at all... but then later
> found this was no longer necessary. No doubt this is due to another change I
> made, but I still don't understand it.

That must have been due to a bug in your code. Too bad you didn't
track it down and fix it when you had the chance. If you had, you
might have averted this problem.
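
One way to hunt for that kind of bug is to switch temporarily to an
error-checking mutex, which reports a relock by the owner or an unlock by a
non-owner instead of deadlocking or silently misbehaving. The helper name
below is hypothetical; the pthread calls and PTHREAD_MUTEX_ERRORCHECK are
standard POSIX.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialise *m as an error-checking mutex.
 * Returns 0 on success or an errno-style error number. */
int init_errorcheck_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

With such a mutex, checking the return value of every pthread_mutex_lock()
and pthread_mutex_unlock() call should show EDEADLK where a thread locks a
mutex it already holds, and EPERM where it unlocks one it does not own.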

> - my test machines include an ARM-based NSLU2 with linuxthreads (one kernel
> process per thread) and an i386-based PC with NPTL (real kernel threads). It
> works fine on both of these.

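Neither LinuxThreads nor NPTL uses one kernel process per thread; both
use real kernel threads. A process, by definition, has its own address
space. Threads, by definition, are parts of the same process. (Under
LinuxThreads each thread has its own PID and shows up separately in ps,
but all of them share a single address space.)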
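
A small sketch of what sharing an address space means in practice: a write
to a global made by one thread is visible to another once the two have
synchronised (here via pthread_join). The function and variable names are
hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

/* One global, living in the single address space all threads share. */
static int shared_counter = 0;

static void *worker(void *arg)
{
    (void)arg;
    shared_counter = 42; /* written by the child thread */
    return NULL;
}

/* Starts a thread, waits for it, and returns what it wrote.
 * Returns -1 if the thread could not be created or joined. */
int value_written_by_thread(void)
{
    pthread_t t;

    shared_counter = 0;
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return shared_counter; /* read by the calling thread */
}
```

Had the worker been a separate process created with fork(), the parent
would still see 0, because each process has its own copy of the variable.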

DS
