Re: Obscure mutex problem
- From: David Given <dg@xxxxxxxxxxx>
- Date: Wed, 05 Sep 2007 20:43:52 +0100
Ulrich Eckhardt wrote:
[...]
> I think logfiles are the wrong way. Rather, make them connect to the process
> using gdb and give you stack traces of all threads.
Alas, my users aren't particularly technically competent. I did walk them
through this a while ago and got some backtraces, but I didn't see anything
that I didn't know about already from log inspection --- all the threads had
blocked exactly where I thought they would be. As the server thread had never
started up, no I/O got scheduled, which meant that the whole thing had deadlocked.
I should point out that the *only* difference between the version running in
the foreground (which worked) and the version running in the background (which
didn't work) was a single call to daemon(0, 0). Same binary, same configuration.
[...]
> This is definitely not true on a multi-CPU machine, where two threads can
> run at the same time. Keep that in mind!
This is what the lock's for --- it enforces a single flow of execution. (As I
said before, this is a coroutine implementation.)
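
A minimal sketch of the idea, not the actual threadlet code: every thread takes one global lock before it runs and gives it up only when it yields, so at most one of them is ever executing, however many CPUs there are. The names cpu_lock, threadlet_body and run_threadlets are hypothetical, and std::thread is assumed here in place of the raw pthreads calls.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the application's single global lock.
static std::mutex cpu_lock;

// Plain ints on purpose: they are only touched while cpu_lock is held.
static int active = 0;      // threads currently inside the locked region
static int max_active = 0;  // highest value 'active' has ever reached

static void threadlet_body(int steps)
{
    for (int i = 0; i < steps; ++i) {
        cpu_lock.lock();                    // acquire the "CPU"
        ++active;
        if (active > max_active)
            max_active = active;
        --active;
        cpu_lock.unlock();                  // yield: another thread may run now
    }
}

// Runs 'count' threads and returns the largest number that were ever
// inside the locked region at the same time.
int run_threadlets(int count, int steps)
{
    active = 0;
    max_active = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < count; ++i)
        threads.emplace_back(threadlet_body, steps);
    for (std::thread& t : threads)
        t.join();
    return max_active;
}
```

Calling run_threadlets(4, 1000) should report 1 on a multi-CPU machine as well, because the lock serialises the bodies.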
[...]
> Well, from looking at it, just some notes:
> - uses macros for creating loops, fails to avoid double evaluation of macro
> parameters
> - uses a nonportable GCC extension (typeof)
Both of these are actually a historical artifact and are no longer used.
> - actively hiding errors by e.g. casting the return value of
> pthread_key_create
Can never fail in that situation.
> - catching exceptions by value instead of reference-to-const
This should affect performance only, not behaviour.
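
One caveat applies when the handler names a base class: catching by value copies only the base part of the thrown object, so virtual calls on the caught copy resolve to the base version. A minimal sketch with hypothetical exception types (BaseError and DerivedError are not taken from the code under discussion):

```cpp
#include <string>

struct BaseError {
    virtual ~BaseError() {}
    virtual std::string what() const { return "base"; }
};

struct DerivedError : BaseError {
    std::string what() const override { return "derived"; }
};

// Handler takes a copy: the DerivedError is sliced down to a BaseError.
std::string caught_by_value()
{
    std::string result;
    try {
        throw DerivedError();
    } catch (BaseError e) {
        result = e.what();
    }
    return result;
}

// Handler binds a reference: the original DerivedError is seen intact.
std::string caught_by_reference()
{
    std::string result;
    try {
        throw DerivedError();
    } catch (const BaseError& e) {
        result = e.what();
    }
    return result;
}
```

When the handler names the exact type that was thrown, the two forms behave the same apart from the extra copy.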
> - dereferencing a pointer after invoking 'delete' on it
If you're talking about:
    delete threadlet;
    threadlet->releaseCPUlock();
...then it's safe because releaseCPUlock() is static. (Yes, I appreciate that
it's badly phrased. I should change it to Threadlet::releaseCPUlock().) If
you're talking about something else, please let me know!
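
A sketch of the rephrased call, assuming a cut-down Threadlet class (the real one is not shown in this thread; the live and releases counters are hypothetical and exist only to make the effect observable). Spelling the call with the class name avoids mentioning the deleted pointer at all:

```cpp
class Threadlet {
public:
    Threadlet()  { ++live; }
    ~Threadlet() { --live; }

    // Static: operates on class-wide state and needs no object.
    static void releaseCPUlock() { ++releases; }

    static int live;      // hypothetical: number of objects currently alive
    static int releases;  // hypothetical: number of times the lock was released
};

int Threadlet::live = 0;
int Threadlet::releases = 0;

// Destroys the threadlet, then releases the lock without touching the pointer.
int destroy_and_release(Threadlet* threadlet)
{
    delete threadlet;
    Threadlet::releaseCPUlock();   // class-qualified call, no object involved
    return Threadlet::releases;
}
```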
> - should use RAII for managing locks
Unnecessary in this case because I only have one lock and it doesn't need
managing (it exists for the entire lifetime of the application).
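
For comparison, the RAII approach being suggested would look roughly like the sketch below. ScopedCPULock, cpu_lock and do_io are hypothetical names, and std::mutex is assumed in place of the pthreads mutex. What the guard adds is that the lock is released on every exit path, including an exception.

```cpp
#include <mutex>
#include <stdexcept>

static std::mutex cpu_lock;   // hypothetical stand-in for the one global lock

// Locks in the constructor, unlocks in the destructor.
class ScopedCPULock {
public:
    ScopedCPULock()  { cpu_lock.lock(); ++held; }
    ~ScopedCPULock() { --held; cpu_lock.unlock(); }

    ScopedCPULock(const ScopedCPULock&) = delete;
    ScopedCPULock& operator=(const ScopedCPULock&) = delete;

    static int held;   // hypothetical counter, only modified while locked
};

int ScopedCPULock::held = 0;

// Returns the number of guards held while the work runs (always 1 here).
// When 'fail' is set it throws instead, and the guard still unlocks.
int do_io(bool fail)
{
    ScopedCPULock guard;
    if (fail)
        throw std::runtime_error("simulated I/O failure");
    return ScopedCPULock::held;
}
```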
> - you are starting a thread that uses a virtual function of an object that
> might not be fully constructed
The thread main function will immediately block on the mutex, preventing it
from reaching the virtual function until after the threadlet constructor
completes (and reaches the stage where some I/O happens).
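
The ordering argument can be sketched as follows, under the assumption that the creating thread holds the global lock for the whole of the construction. This is an illustration, not the real code: cpu_lock, threadMain, Recorder and start_and_wait are hypothetical names, and std::thread stands in for pthread_create.

```cpp
#include <mutex>
#include <string>
#include <thread>

static std::mutex cpu_lock;   // hypothetical single global lock

class Threadlet {
public:
    // Assumed precondition: the caller holds cpu_lock.
    // The thread starts here, before any derived constructor has run.
    Threadlet() : thread_(&Threadlet::threadMain, this) {}
    virtual ~Threadlet() {}
    void join() { thread_.join(); }

protected:
    virtual void run() = 0;

private:
    void threadMain()
    {
        // Blocks until the creator gives up the lock, which it does
        // only after the whole object has been constructed.
        std::lock_guard<std::mutex> hold(cpu_lock);
        run();   // virtual dispatch now reaches the derived class
    }

    std::thread thread_;
};

class Recorder : public Threadlet {
public:
    Recorder() : message_("constructed") {}
    std::string seen;   // what run() observed

protected:
    void run() override { seen = message_; }

private:
    std::string message_;
};

std::string start_and_wait()
{
    cpu_lock.lock();      // the creator owns the "CPU"
    Recorder r;           // thread is started but parked in threadMain()
    cpu_lock.unlock();    // first yield: only now can run() execute
    r.join();
    return r.seen;
}
```

The scheme depends entirely on that precondition: if the lock were not held during construction, the new thread could reach run() while the object was still only a Threadlet.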
> - as someone noted, you are not controlling compiler-generated copy
> constructor and assignment operator
Unnecessary in this case because threadlet objects are never assigned to or
copied... but I've disabled them Just To Be Sure, to no effect.
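
Disabling them amounts to something like the sketch below, shown on a cut-down Threadlet with none of the real members. It uses the C++11 "= delete" spelling; with an older compiler the usual equivalent is to declare both members private and leave them undefined.

```cpp
#include <type_traits>

class Threadlet {
public:
    Threadlet() {}

    // Any attempt to copy or assign a threadlet is now a compile-time error.
    Threadlet(const Threadlet&) = delete;
    Threadlet& operator=(const Threadlet&) = delete;
};

// Compile-time confirmation that both operations are really gone.
static_assert(!std::is_copy_constructible<Threadlet>::value,
              "Threadlet must not be copy-constructible");
static_assert(!std::is_copy_assignable<Threadlet>::value,
              "Threadlet must not be copy-assignable");
```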
So these issues, while possibly bad style --- and yeah, there's stuff in there
that does need tidying --- shouldn't have any actual effect on the *behaviour*
of the program. So, thanks for the analysis (always valuable!), but it still
doesn't get me any closer to what's actually going wrong!
--
┌── dg@cowlark.com ─── http://www.cowlark.com ───────────────────
│
│ "There does not now, nor will there ever, exist a programming language in
│ which it is the least bit hard to write bad programs." --- Flon's Axiom