Re: sata_sil24 resetting controller...



Mogens Valentin wrote:
Tejun Heo wrote:
Jan Dittmer wrote:

Tejun Heo wrote:

serror=0x0
[4297873.266000] sata_sil24 ata1: resetting controller...
[4297873.267000] ata1: status=0x50 { DriveReady SeekComplete }
[4297873.267000] sdc: Current: sense key=0x0
[4297873.267000] ASC=0x0 ASCQ=0x0

The time between these events varies from 0.5 s up to 10 s. Resync
speed is pretty bad (6 MB/s), but it appears(!) to be working.
This is with vanilla 2.6.17-rc3, with the SATA drivers built into the kernel.
Find below the /proc/interrupts and lspci output. The boot dmesg output was
washed away by the above messages, sorry.

What's the cause of the error, can I ignore it or will it destroy
my raid eventually? I'm now about 5% through the resync process,
with an estimated finish in 1260 minutes.


$ lspci -vv -s 03:04.0
0000:03:04.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 01)
Subsystem: Silicon Image, Inc.: Unknown device 7124
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B-
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32
Interrupt: pin A routed to IRQ 22
Region 0: Memory at fa800000 (64-bit, non-prefetchable) [size=128]
Region 2: Memory at fa000000 (64-bit, non-prefetchable) [size=32K]
Region 4: I/O ports at 9400 [size=16]
Expansion ROM at fe900000 [disabled] [size=512K]
Capabilities: [64] Power Management version 2
Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [40] PCI-X non-bridge device.
Command: DPERE- ERO+ RBC=0 OST=5
Status: Bus=3 Dev=4 Func=0 64bit+ 133MHz+ SCD- USC-, DC=simple, DMMRBC=2, DMOST=5, DMCRS=4, RSCEM-
So, slow down the PCI-X bus. This can usually be done from the BIOS setup
menu. Does your machine have a riser board that extends or changes the
orientation of the PCI-X bus? Motherboard vendors describe the bus
frequency limit for riser boards in the manual, but server vendors
sometimes forget to set it. Heck, some of them don't even
know what that is.

Hmm, I don't have a riser card, and I have neither a setting nor a
jumper for the frequency.
I plugged the card into another slot, next to a 66MHz-only card. So now
I have it running at 66MHz (checked with lspci), but my drive isn't
initialized properly anymore:

[4294690.486000] libata version 1.20 loaded.
[4294690.486000] sata_sil24 0000:03:04.0: version 0.23
[4294690.486000] ACPI: PCI Interrupt 0000:03:04.0[A] -> GSI 24 (level, low) -> IRQ 22
[4294690.487000] ata1: SATA max UDMA/100 cmd 0xF8810000 ctl 0x0 bmdma 0x0 irq 22
[4294690.487000] ata2: SATA max UDMA/100 cmd 0xF8812000 ctl 0x0 bmdma 0x0 irq 22
[4294690.487000] ata3: SATA max UDMA/100 cmd 0xF8814000 ctl 0x0 bmdma 0x0 irq 22
[4294690.487000] ata4: SATA max UDMA/100 cmd 0xF8816000 ctl 0x0 bmdma 0x0 irq 22
[4294690.800000] ata1: SATA link up 3.0 Gbps (SStatus 123)
[4294690.801000] ata1: dev 0 cfg 49:5145 82:0000 83:0000 84:0000 85:0000 86:0000 87:0000 88:0000
[4294690.801000] ata1: dev 0 ATA-0, max MWDMA2, 16514064 sectors: CHS 16383/16/63
[4294690.802000] ata1: dev 0 model number mismatch 'WDC WD3200re/sasats_li42S' != ''
[4294690.802000] ata1: dev 0 revalidation failed (errno=-19)
[4294690.802000] ata1: failed to revalidate after set xfermode
[4294690.802000] scsi2 : sata_sil24
[4294691.003000] ata2: SATA link down (SStatus 0)
[4294691.003000] scsi3 : sata_sil24
[4294691.204000] ata3: SATA link down (SStatus 0)
[4294691.204000] scsi4 : sata_sil24
[4294691.405000] ata4: SATA link down (SStatus 0)
[4294691.405000] scsi5 : sata_sil24
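
For readers decoding the log above: the number after "SStatus" is the SATA link status register, which libata appears to print in hexadecimal (an assumption here, based on "SStatus 123" being paired with "link up 3.0 Gbps"). A minimal sketch of the field layout follows; the function name and the description strings are illustrative, not taken from the kernel source.

```python
# Sketch: split a SATA SStatus value into its three 4-bit fields.
# Layout per the Serial ATA specification: DET = bits 3:0, SPD = bits 7:4,
# IPM = bits 11:8. The description strings below are paraphrased.

DET = {
    0x0: "no device detected, PHY communication not established",
    0x1: "device detected, PHY communication not established",
    0x3: "device detected, PHY communication established",
    0x4: "PHY offline",
}
SPD = {0x0: "no negotiated speed", 0x1: "1.5 Gbps", 0x2: "3.0 Gbps"}
IPM = {0x0: "not present", 0x1: "active", 0x2: "partial", 0x6: "slumber"}


def decode_sstatus(value):
    """Return the raw DET/SPD/IPM fields and a readable description of each."""
    det = value & 0xF
    spd = (value >> 4) & 0xF
    ipm = (value >> 8) & 0xF
    return {
        "det": (det, DET.get(det, "reserved")),
        "spd": (spd, SPD.get(spd, "reserved")),
        "ipm": (ipm, IPM.get(ipm, "reserved")),
    }


if __name__ == "__main__":
    # "SStatus 123" and "SStatus 0" as they appear in the dmesg output,
    # interpreted as hexadecimal.
    for raw in ("123", "0"):
        print(raw, decode_sstatus(int(raw, 16)))
```

Read this way, ata1 negotiated a 3.0 Gbps link and the interface is active, so the link itself came up; the failure happens later, when the identify data is read back.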

Can this still be a pci bus problem? I get the same error on every
reboot.
Hmmm.. max MWDMA2? Something is very off with your configuration. Can
you try the card in another box or on a regular PCI slot?

Since moving the card to another slot changes the behaviour/problem, I'm
thinking it might be a mobo implementation problem with slots
interacting WRT IRQ, like in the older PCI-IRQ problem days.

You might try shifting that card and other cards in various slots and
dump the IRQ table for each combination. Maybe simply take out any other
cards you can live without while trying out the various slots.
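
One way to compare the IRQ table across slot combinations is to list only the IRQs that more than one device shares. The following is a rough sketch, assuming the usual /proc/interrupts layout in which device names form a comma-separated list at the end of each numbered line and contain no spaces. The sample text is made up for illustration and is not output from the machine in this thread.

```python
# Sketch: report IRQs shared by more than one device in /proc/interrupts text.
# Assumption: device names are the trailing comma-separated list on each
# numbered line and contain no whitespace.

def shared_irqs(text):
    """Map IRQ number -> list of device names, for IRQs with 2+ devices."""
    shared = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or not label.strip().isdigit():
            continue  # header line, or non-numeric rows such as NMI/LOC/ERR
        parts = [p.strip() for p in rest.split(",")]
        if not parts[0]:
            continue
        # The first chunk holds the counters, the controller type and the
        # first device name; the device name is its last token.
        devices = [parts[0].split()[-1]] + [p for p in parts[1:] if p]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


# Hypothetical sample, for illustration only.
SAMPLE = """\
           CPU0       CPU1
 16:       1200          0   IO-APIC-level  eth0
 22:      53411          0   IO-APIC-level  sata_sil24, uhci_hcd:usb1
NMI:          0          0
"""

if __name__ == "__main__":
    print(shared_irqs(SAMPLE))
```

On a live system the same function could be fed `open("/proc/interrupts").read()` after each slot change, and the results compared side by side.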

I have now moved the SATA card into a 66MHz, 32-bit PCI slot and the
problems went away. Just for the record, this is an Asus PU-DLS
mainboard with the E7501 chipset. Now I can dd from all devices without
any error messages, giving me about 360 MB/s continuous throughput for
6 devices, which isn't that bad I suppose.
The card gets assigned IRQ 22 in both configurations, but in the
latter the IRQ is shared with the on-board usb-uhci controller,
which somehow seems to work better...

Thanks for all your help,

Jan

