Re: OHCI root_port_reset() deadly loop...



To add some more information here, I think the EHCI idea might
hold some water.

What I have here are two NEC OHCI USB interfaces and one NEC EHCI
USB interface on PCI. Apparently they all go through a shared
USB hub, mapped like this:

HUB Port 1: OHCI #1, EHCI
HUB Port 2: OHCI #2, EHCI
HUB Port 3: OHCI #1, EHCI
HUB Port 4: OHCI #2, EHCI
HUB Port 5: OHCI #1, EHCI

The OHCI ports go out to external USB connectors on the back panel of
the machine, whereas the EHCI is connected up to an internal USB
storage CDROM device and what appears to be another USB hub.

There's actually no such thing as an "EHCI port" or an "OHCI port".
Instead, there's a set of ports, each of which can be switched so
the USB differential data signals go up to either controller.

When EHCI starts, that switch points to EHCI so that devices can try
enumerating with high speed signaling. When a device doesn't respond
to that "chirp", the EHCI root hub driver switches the port to the
companion controller. (Which is OHCI here, UHCI on some PCs, etc.)
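The handoff described above can be sketched as a toy model in C. This is only an illustration of the routing rule, not the kernel's EHCI root hub code; the names `shared_port`, `ehci_start` and `port_connect` are made up for the example, and the "chirp" result is passed in as a plain flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: which controller a shared root port is routed to. */
enum port_owner {
    OWNER_EHCI,      /* high-speed controller */
    OWNER_COMPANION  /* OHCI here, UHCI on some PCs */
};

struct shared_port {
    enum port_owner owner;
    bool connected;
};

/* When EHCI starts, every port's switch points at EHCI. */
static void ehci_start(struct shared_port *ports, size_t nports)
{
    assert(ports != NULL);
    for (size_t i = 0; i < nports; i++) {
        ports[i].owner = OWNER_EHCI;
        ports[i].connected = false;
    }
}

/*
 * A device shows up on a port. If it answers the high-speed "chirp",
 * EHCI keeps the port; otherwise the port is switched to the companion.
 * Returns the controller that ends up owning the port.
 */
static enum port_owner port_connect(struct shared_port *port, bool answers_chirp)
{
    assert(port != NULL);
    port->connected = true;
    if (port->owner == OWNER_EHCI && !answers_chirp)
        port->owner = OWNER_COMPANION;
    return port->owner;
}
```

In this model a full-speed keyboard plugged into any port ends up on the companion controller, while a high-speed storage device stays on EHCI.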


The problem seems to be very strongly tied to timing. For example
simply adding "ignore_loglevel" to the kernel boot command line can
make the problem go away.

This got me thinking about your EHCI comment.

If these controllers are going through the same HUB, things might go
south if OHCI initialized first, then khubd et al. are asynchronously
accessing the segments behind OHCI at the same time that the EHCI
driver is initializing. Perhaps this is the kind of sequence of
events which makes one of the root ports reset in such a way that
the reset bit never clears.

Given that this machine has 64 cpus, the likelihood of such parallel
accesses is very high :-)

Does this make any sense?

Yes, that's why I asked about EHCI. My speculation would be that
OHCI starts the reset, and EHCI claims the port before it completes;
or contrariwise OHCI starts the reset right after EHCI claims it.

And there's some point in that process where a hardware race makes
the trouble you've observed. I believe there are plenty of other
places where it's perfectly fine if EHCI grabs the port, or this
little race would have shown up many times before.

- Dave

