Re: [RFC, PATCH] locks: remove posix deadlock detection



On Sun, Oct 28, 2007 at 06:40:52PM +0000, Alan Cox wrote:
On Sun, 28 Oct 2007 12:27:32 -0600
Matthew Wilcox <matthew@xxxxxx> wrote:

On Sun, Oct 28, 2007 at 01:43:21PM -0400, J. Bruce Fields wrote:
We currently attempt to return -EDEALK to blocking fcntl() file locking
requests that would create a cycle in the graph of tasks waiting on
locks.

This is inefficient: in the general case it requires us determining
whether we're adding a cycle to an arbitrary directed acyclic graph.
And this calculation has to be performed while holding a lock (currently
the BKL) that prevents that graph from changing.

It has historically been a source of bugs; most recently it was noticed
that it could loop indefinitely while holding the BKL.

It can also return -EDEADLK spuriously. So yeah, just kill it.

NAK. This is an ABI change. It was also comprehensively rejected before
because

- EDEADLK behaviour is ABI
- EDEADLK behaviour is required by SuSv3
- We have no idea what applications may rely on this behaviour.

On the second point I think you're mistaken; see

http://www.opengroup.org/onlinepubs/009695399/functions/fcntl.html

"Since implementation of full deadlock detection is not
always feasible, the [EDEADLK] error was made optional."

The third objection is the one that concerns me most, and is the one I'd
like to understand better. So I'd be interested in any evidence of
applications that do actually depend on the current behavior. (Even
hypothetical use cases might be interesting.) I'm also curious what
other OS's do.

See the thread
http://osdir.com/ml/file-systems/2004-06/msg00017.html

so we need to fix the bugs - the lock usage and the looping. At that
point it merely becomes a performance concern to those who use it, which
is the proper behaviour. If you want a faster non-checking one use
flock(), or add another flag that is a Linux "don't check for deadlock"

That's an interesting idea. The flag might not help as much, since
looking for a cycle in the graph of blocked requests may be more
difficult in the case where we don't know whether the graph is acyclic
to start out with. (But I don't know--I haven't thought about it much.)

--b.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • Re: [RFC, PATCH] locks: remove posix deadlock detection
    ... whether we're adding a cycle to an arbitrary directed acyclic graph. ... And this calculation has to be performed while holding a lock (currently ... EDEADLK behaviour is required by SuSv3 ...
    (Linux-Kernel)
  • Re: 2.6.11-rc2-mm1
    ... > o Annotate code with the lock that could be taken. ... > o Create a graph from the locking hierarchy obtained above. ... the graph surely has cycles. ... send the line "unsubscribe linux-kernel" in ...
    (Linux-Kernel)
  • Re: deadlock detection
    ... filename, the function of the lock, and the offset of the requested ... I am using one mutex for graph edits, ... I did not see anything in your links that looked like a graph ... but the guts of the cycle detector are only ...
    (comp.programming)
  • Re: Intermitent IMediaControl::Stop problem
    ... > There's also a worker thread owned by the filter graph control code which is ... > the code to gain the lock pumps messages while waiting for the lock. ...
    (microsoft.public.win32.programmer.directx.video)
  • counting cycles
    ... :> its meaning is perfectly clear and non-controversial. ... If you try to graph it, ... Except that this cycle IS infinite. ... It's hard to know what you even COULD mean by an infinite cycle. ...
    (sci.logic)