Re: deadlock in "modprobe -r ohci1394" shortly after "modprobe ohci1394"



Alan Stern wrote:
On Sun, 19 Nov 2006, Stefan Richter wrote:
@@ -1873,8 +1873,19 @@ static void nodemgr_remove_host(struct h
...
if (hi) {
+ /* up(&host->device.sem); --- apparently not required */
+ if (host->device.parent)
+ up(&host->device.parent->sem);
kthread_stop(hi->thread);
+ if (host->device.parent)
+ down(&host->device.parent->sem);
+ /* down(&host->device.sem); --- apparently not required */
nodemgr_remove_host_dev(&host->device);
}
}

Obviously this patch isn't pretty. It's also incorrect, because it
reacquires the parent's semaphore while holding the child's -- that's
another recipe for deadlock.

Knowing nothing at all about ieee1394, I get the feeling that the culprit
here is a strange subsystem design. In fact, I don't understand exactly
what's going wrong. Evidently the rmmod thread owns the locks for both
the host being removed and its parent, and it wants to stop knodemgrd,
which is waiting to acquire the host's parent's lock because it is
attempting to rescan the parent. Is that right?

That's right.

It doesn't make sense. If knodemgrd is rescanning the parent then the
parent must not have a driver. If it doesn't have a driver, how can it
have children?

Well, I don't fully understand the reasoning behind it. (But even less
do I grasp the driver core.) Short story: It may indeed be possible to
get rid of the parent relationship to the host device and of the
rescanning of the host device. Long story:

ieee1394 has:
- struct hpsb_host (contains a struct device and a struct class_device)
ieee1394's nodemgr in addition has:
- struct node_entry (ditto)
- struct unit_directory (ditto)

nodemgr also provides different .release callbacks for each of these
devices and class_devices, an .uevent callback for unit_directory
class_devices and the struct bus_type for the three kinds of device.

All of the existing protocol drivers bind only to unit_directories, so
these are the most important ones as far as driver matching is
concerned. (There is only a little out-of-tree driver, mem1394 for
forensics, which binds differently because it doesn't require any actual
protocol on a remote node.)

A hpsb_host is merely access to controllers; each one controls a bus. On
a bus live a few node_entries, as many as there are nodes with an active
link layer controller. This includes the local host, therefore there is
always at least one node_entry per bus. A node may implement 0, 1 or
more units and presents them in a hierarchical manner. A unit talks a
protocol; and our protocol drivers access such a unit and, if the
protocol has this necessity, the unit's node. (While doing so, it also
accesses the hpsb_host in front of the node.)

If support for eth1394 is configured in, our drivers add a unit to each
of the local nodes which indicate IP over 1394 capability to external
nodes. This unit_directory also matches the eth1394 when the driver core
scans the ieee1394_bus_type.

So in short, I think we could actually live with a minimum sysfs
implementation which exposes only unit directories. (But I may be
missing something --- I'm not quite familiar with the sysfs interfacing
tools, that is udev and hald.) At least device file naming of SBP-2
units wouldn't suffer because udev takes the unique name of SBP-2 units
from a sysfs attribute of the SCSI device, not from the unit_directory's
device.

The /sys/class/net/eth?/device links of eth1394 class devices currently
point to the hpsb_host devices i.e. fw-host devices; maybe they could as
well point to respective unit directories.

So maybe a simple, flat sysfs representation without node_entry devices,
perhaps also without fw-host devices --- or at least a representation
(1) without parent relationship to the fw-host devices and (2) without
ieee1394_bus_type in fw-host devices --- is possible without breaking
userspace. It probably would eliminate the problem which we are
discussing here. This would mean that we loose the ability to bind
protocol drivers to a hpsb_host (and depending on how far the
respresentation is cut down, also the ability to bind protocol drivers
to a node_entry). But as I said I see no actual need for this ability.
--
Stefan Richter
-=====-=-==- =-== =--==
http://arcgraph.de/sr/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • Re: deadlock in "modprobe -r ohci1394" shortly after "modprobe ohci1394"
    ... attempting to rescan the parent. ... If it doesn't have a driver, ... - struct hpsb_host ... All of the existing protocol drivers bind only to unit_directories, ...
    (Linux-Kernel)
  • Re: [PATCH 1/1] MFD: Add U300 AB3100 core support v2
    ... If you register sub-devices this way, they won't appear as ab3100 children. ... I'll set the parent in the first loop then because I ... after copying it to a local driver struct. ... + * This struct is PRIVATE and devices using it should NOT ...
    (Linux-Kernel)
  • Re: High DPC Use and a Method to Reboot a network card?
    ... I'll post to the WinPCap community to see if there is a known ... Well, ok, protocol drivers like the one that comes with WinPCap usually ... miniport driver underneath them: ... reset request to an underlying driver." ...
    (microsoft.public.development.device.drivers)
  • Re: High DPC Use and a Method to Reboot a network card?
    ... Well, ok, protocol drivers like the one that comes with WinPCap usually ... miniport driver underneath them: ... ProtocolReceivePacket[in e.g. WinPCap] ... which runs in DPC context (i.e. at IRQL = ...
    (microsoft.public.development.device.drivers)
  • Re: Miniport versus intermediate???
    ... as an Ethernet driver and dynamically switch between them? ... "Miniport Driver with a WDM Lower Interface " ... bridge between the Protocol drivers, and the underlying hardware driver (and ... at its upper edge, then at its lower edge it interfaces with a miniport ...
    (microsoft.public.development.device.drivers)

Loading