Re: [2.6.30-rc2] usb reset during big file transfer and ext3 error



Hi, Robert.

On Apr 21 2009, Robert Han*** wrote:
(ccing linux-usb)

Ok.

Rogério Brito wrote:
(...)
Unfortunately, when I was transferring the contents of 2 DVDs from the
main IDE HD to a USB external HD, I got errors from the USB host, the
writes to the external HD started failing, and the ext3 filesystem there
went into error mode and was remounted read-only.

I eventually lose access to the device (i.e., the /dev/sd??? device
isn't there anymore) and I then have to re-run fsck on that
filesystem.

This has already happened 2 or 3 times, and I have observed that it
only occurs under heavy traffic: if I am, say, compiling the
kernel on that external HD, I don't see any problems.

I just saw it reoccur once more, this time inducing a stacktrace related
to ext3. :-(

Attached is part of the dmesg log that shows the problem. I put the
whole dmesg at <http://rb.doesntexist.org/linux/>.

As always, if any further information is needed, please let me know.

You're seeing these:

[103051.265045] ehci_hcd 0000:00:1d.7: detected XactErr len 1536/4096 retry 1
[103051.265156] ehci_hcd 0000:00:1d.7: detected XactErr len 1536/4096 retry 2
[103051.265281] ehci_hcd 0000:00:1d.7: detected XactErr len 1536/4096 retry 3
[103051.265406] ehci_hcd 0000:00:1d.7: detected XactErr len 1536/4096 retry 4

Precisely.
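
In case it helps anyone comparing logs, here is a rough Python sketch that
pulls these messages out of a saved dmesg and reports the highest retry
count seen per controller. The function names and the field names
(len_done, len_total) are my own; I am assuming the two numbers in
"len N/M" are bytes transferred and bytes requested. The regex tolerates
the line wrapping that mail clients add.

```python
import re
from collections import defaultdict

# Matches e.g. "[103051.265045] ehci_hcd 0000:00:1d.7: detected XactErr
# len 1536/4096 retry 1", even if "retry N" was wrapped onto the next line.
XACTERR_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+ehci_hcd\s+(?P<dev>\S+?):\s+"
    r"detected XactErr len (?P<done>\d+)/(?P<total>\d+)\s+retry (?P<retry>\d+)"
)


def parse_xacterr(log_text):
    """Return one dict per XactErr message found in log_text."""
    events = []
    for m in XACTERR_RE.finditer(log_text):
        events.append({
            "ts": float(m.group("ts")),
            "dev": m.group("dev"),
            # Assumed meaning: bytes transferred / bytes requested.
            "len_done": int(m.group("done")),
            "len_total": int(m.group("total")),
            "retry": int(m.group("retry")),
        })
    return events


def max_retry_per_controller(events):
    """Map each controller (PCI address) to the highest retry count seen."""
    worst = defaultdict(int)
    for ev in events:
        worst[ev["dev"]] = max(worst[ev["dev"]], ev["retry"])
    return dict(worst)


SAMPLE = """\
[103051.265045] ehci_hcd 0000:00:1d.7: detected XactErr len 1536/4096
retry 1
[103051.265156] ehci_hcd 0000:00:1d.7: detected XactErr len 1536/4096
retry 2
[103051.265281] ehci_hcd 0000:00:1d.7: detected XactErr len 1536/4096
retry 3
[103051.265406] ehci_hcd 0000:00:1d.7: detected XactErr len 1536/4096
retry 4
"""

if __name__ == "__main__":
    print(max_retry_per_controller(parse_xacterr(SAMPLE)))
```

To use it on a real log, read the file into a string and pass it to
parse_xacterr() instead of SAMPLE.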

According to the EHCI spec, XactErr is "Set to a one by the Host
Controller during status update in the case where the host did not
receive a valid response from the device (Timeout, CRC, Bad PID,
etc.)"
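
For reference, my reading of the spec's qTD token layout is that XactErr
is bit 3 of the status byte, with the error counter (CErr) in bits 11:10
and the remaining byte count in bits 30:16. A small decoding sketch under
that reading follows; the function name and the example token value are
made up for illustration and do not come from my log.

```python
# Status byte (bits 7:0) of the qTD token, as I read the EHCI spec.
STATUS_BITS = [
    (7, "Active"),
    (6, "Halted"),
    (5, "Data Buffer Error"),
    (4, "Babble Detected"),
    (3, "XactErr"),
    (2, "Missed Micro-Frame"),
    (1, "SplitXstate"),
    (0, "Ping State/ERR"),
]


def decode_qtd_token(token):
    """Split a 32-bit qTD token into status flags, CErr and byte count."""
    flags = [name for bit, name in STATUS_BITS if token & (1 << bit)]
    return {
        "flags": flags,
        "cerr": (token >> 10) & 0x3,                  # error counter, bits 11:10
        "bytes_to_transfer": (token >> 16) & 0x7FFF,  # bits 30:16
    }


if __name__ == "__main__":
    # Hypothetical token: 2560 bytes left, CErr = 2, XactErr set.
    example = (2560 << 16) | (2 << 10) | (1 << 3)
    print(decode_qtd_token(example))
```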

Is there any way of controlling the number of retries in the host
controller? Or, perhaps, of controlling the time between retries so that
the device has a chance to recover?

Quite likely this is some kind of hardware problem - maybe the USB
port doesn't quite provide enough power for the drive, etc.

I see. The first thing I thought of when I saw this comment of yours
was that there could be some heat issue, with the drive not cooling
down.

In this particular case, the USB enclosure is externally powered and it
contains a SATA drive. I had also never seen this occur before when it was
connected to an EHCI port on another system, even while transferring
more data.

A lot of these USB enclosure devices are also rather poor quality in
general.

Agreed. Not everybody does things correctly by the book. OTOH, these are
the devices present in "the real world". Would there be workarounds for
such situations?


Thanks, Rogério Brito.

--
Rogério Brito : rbrito@{mackenzie,ime.usp}.br : GPG key 1024D/7C2CAEB8
http://www.ime.usp.br/~rbrito : http://meusite.mackenzie.com.br/rbrito
Projects: algorithms.berlios.de : lame.sf.net : vrms.alioth.debian.org