Re: open() behavior under heavy disk load
- From: Rainer Weikusat <rweikusat@xxxxxxxxxxx>
- Date: Fri, 16 Nov 2007 21:16:52 +0100
David Schwartz <davids@xxxxxxxxxxxxx> writes:
On Nov 14, 3:44 am, Rainer Weikusat <rweiku...@xxxxxxxxxxx> wrote:
The problem is that people who simply don't care 'how to use it' can
create more severe issues with threads than without them. And since
(for instance) effects which may be created because of race conditions
are hardly demonstrable and can be understood as 'low probability
failure condition' it is very conceivable that a great many people
will not care for them.
Except that approaches that don't use threads have the same 'low
probability failure conditions' that are very hard to detect and
fix.
They don't. At best, there could be 'similar issues'. No
single-threaded program will ever experience potentially severe data
corruption because write access to something shared among different
threads wasn't explicitly serialized. And I personally know people who
(among other things) do programming and would never serialize
anything, because the chances that it actually breaks are (in their
opinion) so slim that it simply wouldn't be worth the typing (after
all, everything breaks sometimes and needs to be fixed afterwards).
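
To make 'explicitly serialized' concrete, here is a minimal sketch
using POSIX threads. The names (struct shared_counter,
run_serialized_counter) and the thread limit are made up for
illustration and do not come from any code discussed in this thread.
Without the lock/unlock pair around the increment, updates from
different threads can be lost without any visible error.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16  /* hypothetical upper bound for this sketch */

/* Hypothetical shared state: a counter plus the mutex guarding it. */
struct shared_counter {
    pthread_mutex_t lock;
    long value;
    long increments_per_thread;
};

static void *worker(void *arg)
{
    struct shared_counter *c = arg;

    for (long i = 0; i < c->increments_per_thread; i++) {
        /* Serialize the read-modify-write on the shared value. */
        pthread_mutex_lock(&c->lock);
        c->value++;
        pthread_mutex_unlock(&c->lock);
    }
    return NULL;
}

/* Runs nthreads workers and returns the final counter value. */
long run_serialized_counter(int nthreads, long increments_per_thread)
{
    struct shared_counter c;
    pthread_t tids[MAX_THREADS];

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    c.value = 0;
    c.increments_per_thread = increments_per_thread;
    pthread_mutex_init(&c.lock, NULL);

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &c);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);

    pthread_mutex_destroy(&c.lock);
    return c.value;
}
```

With the mutex in place, the result is always nthreads times
increments_per_thread. Removing the two mutex calls will still appear
to 'generally work' in many test runs, which is exactly the kind of
low-probability failure under discussion.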
A good example is rarely-used code triggering a code page fault which
results in load backing up. The backup means that data is no longer in
cache from when it was received by the time it's processed. Thus the
server never recovers from the backup.
This does not really relate to 'race conditions'. Page faults occur in
all code, not only in single-threaded code, and a system where
constant responsiveness is so critical that a single page fault
occurring during execution will result in a permanent backlog of jobs
to be processed should use techniques suitable to deal with
that. Examples would be memory locking or simply dropping the backlog
(such a system must be a realtime system, meaning that processing of
'old events' is much less useful than processing of 'current' events).
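
Both techniques can be sketched briefly. mlockall() with MCL_CURRENT
and MCL_FUTURE is the standard call for memory locking. The backlog
structure, its capacity, and the function names are hypothetical and
only illustrate the 'drop old events' policy: when the queue is full,
the oldest event is discarded in favour of the newest.

```c
#include <stddef.h>
#include <sys/mman.h>

/*
 * Pin all current and future pages of the process in RAM so that no
 * page fault has to go to disk. Returns 0 on success, -1 on failure
 * (errno set, e.g. when RLIMIT_MEMLOCK is too low).
 */
int lock_process_memory(void)
{
    return mlockall(MCL_CURRENT | MCL_FUTURE);
}

#define BACKLOG_CAP 4  /* hypothetical capacity for this sketch */

/* Fixed-size ring buffer; zero-initialize before first use. */
struct backlog {
    int events[BACKLOG_CAP];
    size_t head;            /* index of the oldest queued event */
    size_t count;           /* number of queued events */
    unsigned long dropped;  /* events discarded because of overload */
};

/* Queue an event; if the backlog is full, drop the oldest one. */
void backlog_push(struct backlog *b, int event)
{
    if (b->count == BACKLOG_CAP) {
        b->head = (b->head + 1) % BACKLOG_CAP;
        b->count--;
        b->dropped++;
    }
    b->events[(b->head + b->count) % BACKLOG_CAP] = event;
    b->count++;
}

/* Fetch the oldest remaining event; returns -1 if the backlog is empty. */
int backlog_pop(struct backlog *b, int *event)
{
    if (b->count == 0)
        return -1;
    *event = b->events[b->head];
    b->head = (b->head + 1) % BACKLOG_CAP;
    b->count--;
    return 0;
}
```

Pushing events 1 through 6 into an empty backlog of capacity 4 drops
events 1 and 2, so the next event processed is 3.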
Bad programmers will do anything badly. Good programmers will do
anything well.
This is a somewhat simplistic view. An example is a deadlock I
recently encountered. It occurred when a reference-counted data
structure was supposed to be garbage collected because of inactivity
from the appliance it represented while there was still an active
transmission for this particular destination. The data structure used
to hold state for a reliable (i.e. retransmitted until acknowledgment)
transmission held a reference to the first data structure, too,
because replies could come in at any time, in particular after the
garbage collector had dropped its reference and before it got around
to disabling processing of them. The idea behind the code was that the
'freeing routine', if it encountered (after decrementing) a reference
count of one, would check if there was an active transmission attempt
and then cancel the transmission. To do so, it incremented the
reference count again and called a 'cancel this transmission'
routine. Unfortunately, this caused the second invocation of the
freeing routine to again find the reference count at one, and
consequently, it would attempt to lock the config transaction queue
structure again to determine if there was an active transaction, which
led to all threads in the process deadlocking really fast. Obvious in
hindsight, the mock reference count needs to be incremented twice to
prevent this from happening.
But 'obvious in hindsight' is something rather different from just
writing the code fast, because of the usual time pressure, testing
that it 'generally works' and relying, because of the same time
pressure, on the assumption that one rarely does something really
stupid.
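
The re-entrancy problem described above can be modelled in a few
lines. This is a single-threaded simulation and not the original
code: all names are invented, the non-recursive queue lock is replaced
by a flag so that a second lock attempt is counted instead of actually
blocking, and the way the mock references are dropped afterwards is an
assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the reference-counted structure. */
struct peer {
    int refs;
    bool tx_active;       /* an active transmission holds one reference */
    bool queue_locked;    /* models a non-recursive lock */
    int relock_attempts;  /* each one would be a self-deadlock */
    bool freed;
};

static void peer_put(struct peer *p, int mock_refs);

/* Cancelling the transmission drops the reference it held. */
static void cancel_tx(struct peer *p, int mock_refs)
{
    p->tx_active = false;
    peer_put(p, mock_refs);
}

/* The 'freeing routine': drop one reference and act on the result. */
static void peer_put(struct peer *p, int mock_refs)
{
    if (--p->refs == 0) {
        p->freed = true;
        return;
    }
    if (p->refs != 1)
        return;

    /* Count of one: lock the queue to look for an active transmission. */
    if (p->queue_locked) {
        p->relock_attempts++;   /* real code would block forever here */
        return;
    }
    p->queue_locked = true;
    if (p->tx_active) {
        p->refs += mock_refs;   /* mock reference(s) for the cancel call */
        cancel_tx(p, mock_refs);
        p->refs -= mock_refs;   /* assumption: dropped directly afterwards */
        if (p->refs == 0)
            p->freed = true;
    }
    p->queue_locked = false;
}

/* Garbage collector drops its reference; returns the relock count. */
int simulate_free(int mock_refs)
{
    struct peer p = { .refs = 2, .tx_active = true };

    peer_put(&p, mock_refs);
    return p.relock_attempts;
}
```

In this model, one mock reference lets the nested call see a count of
one again and try to take the lock a second time, while two mock
references keep the nested call at a count of two so it returns
without touching the lock.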
Odd corner bug cases are not unique to threads.
The effects of 'race conditions when accessing shared data' are
(mostly; processes can share data, too). And my original statement was
just that I strongly suspect that some people would choose to ignore
the possibility, counting on the probable assumption that nobody will
ever have the time to check the code in detail again, but will just
fix whatever was broken this time to get the thing online/working
again fast (this level of malice is somewhat hypothetical, because
'being thoughtless' is different from 'being malicious').