Re: device struct bloat
- From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
- Date: Tue, 06 Nov 2007 18:19:26 +0100
On Tue, 2007-11-06 at 11:32 -0500, Alan Stern wrote:
On Tue, 6 Nov 2007, Peter Zijlstra wrote:
On Tue, 2007-11-06 at 10:36 -0500, Alan Stern wrote:
You can't possibly solve the lockdep problems here with a simple-minded
approach like your DRIVER_NORMAL, DRIVER_PARENT, etc. The device tree
is arbitrarily deep & wide, and there is at least one routine that
acquires the semaphores for _all_ the devices in the tree.
*blink* someone needs to take all locks - why?
It's for system suspend. All the devices are locked to prevent driver
binding and unbinding during the suspend transition.
Just locking the tree root is not enough? (thereby avoiding all any new
modifying operation to descend into the tree).
Even apart from that, there are places where the USB core needs to
acquire multiple layers of locks (more than just two). For example, if
somebody has hubs nested several deep and then unplugs the hub closest
to the computer, this will happen.
Pin the sub-tree root with a reference, iterate the sub-tree and delete
the leafs one by one?
Shucks, any time a driver's probe routine tries to register a child
device you will run into problems. probe() is always called with the
device locked, and registration of a child will acquire the child's
lock in order to probe drivers for the child.
Right, so you say you have unbounded stack recursion? How about pushing
each probe into a stack (workqueue perhaps). Then each stack entry would
only do a single level probe, if it would recurse push a new entry on
the stack. Basic recursive -> stack transform.
This fact
alone seems to preclude using lockdep for device locks. (If there was
a form of mutex_lock() that bypassed the lockdep checks, you could use
it and avoid these issues.)
I'm sceptical of this, since its a simple tree (as opposed to a balanced
tree) a simple lock-coupling approach should be enough to guarantee
consistency.
I don't follow your reasoning. Let's say a task wants to lock all the
direct children of a particular device. In what order should the locks
be acquired?
I'd first start by asking if you want to lock all the children or the
sub-tree. The latter can be done by locking the root of said sub-tree.
There's no natural tree-related ordering to follow.
Of course there is a natural deadlock free locking order in a tree: lock
the parent, lock all its children, repeat by noting that each child is a
parent again. If you only ever lock top down, there should not be any
lateral dependencies.
In the simple case where locks are acquired along a path in the tree,
you could solve the lockdep issues by passing mutex_lock_nested() a key
equal to the device's depth in the tree. But that won't work with more
complicated cases.
And only up to 8, as that's the max nesting depth.
Deadlock is a serious consideration. For the most part, routines
locking devices do so along a single path in the tree. For this simple
case the rule is: Never acquire a parent's lock while holding the
child's lock.
Sure, but once you have a parent's lock, you could unlock your
grandparent, no? (it having a locked child, your parent, should be
enough to guarantee its continued existence)
Of course. So what? In many cases the code needs to keep all three
locks. The locks don't merely guarantee the device structs' continued
existence (a kref could do that) -- they serialize a number of related
operations.
Right, but does the grandparent really need serialisation? Normal simple
tree manipulations don't require more than 2 locks to guarantee tree
consistency. So you either don't have a simple tree, or you have more
requirements.
The routine that locks all the devices acquires the locks in order of
device registration. The idea here is that children are always
registered _after_ their parents, so this should be compatible with the
previous rule. But there is a potential problem: device_move() can
move an older child under a younger parent!
Seems like a weird rule, a typical tree locking rule would be to lock
them top-down - such a rule can easily cope with moves: lock old parent,
lock child, lock new parent, move child, unlock all in reverse order.
Yep. But this isn't a typical tree-locking. System suspend sometimes
requires that devices be powered down in a specific order, and there
are constraints not captured by the positions in the tree. We get
around this by powering devices down in reverse order of detection,
which should always be safe. Although the locking need not necessarily
follow these same constraints, it is certainly the easiest approach.
Ah, there are more constraints than tree order (which as I get it
resembles bus order)? Which confuses me, as you cannot detect a leaf
before you have a parent, so top-down should still be good, no?
(And by the way, your example rule "lock old parent, lock child, lock
new parent" is subject to deadlocks. What if another task tries to
move a different device from under the new parent to the old parent at
the same time?)
D'0h, right you are. How about a delete, insert sequence?
Like said, I think the tree locking model should be revisited. A
top-down locking model with lock-coupling should, from my ignorant
perspective, solve your problems.
I don't understand what you mean by "lock-coupling".
A
B C
D E F G
In order to lock F we do:
lock A, lock C, unlock A, lock F, unlock C
Look at it this way, either you're more complex than the VFS or it can
be annotated :-)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
- Follow-Ups:
- Re: device struct bloat
- From: Alan Stern
- Re: device struct bloat
- References:
- Re: device struct bloat
- From: Alan Stern
- Re: device struct bloat
- Prev by Date: [RFC] [PATCH 2/3] Recursive mtime for ext3
- Next by Date: [RFC] [PATCH 3/3] Recursive mtime for ext3
- Previous by thread: Re: device struct bloat
- Next by thread: Re: device struct bloat
- Index(es):
Relevant Pages
|