2.6.1 IO lockup on SMP systems
From: Sergey S. Kostyliov (rathamahata_at_php4.ru)
Date: 01/31/04
- Previous message: Mans Matulewicz: "Re: ide-cdrom / atapi burning bug - 2.6.1"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
To: linux-kernel@vger.kernel.org Date: Sat, 31 Jan 2004 19:40:27 +0300
Hello all,
I had experienced a lockups on three of my servers with 2.6.1. It doesn't
look like a deadlock, the box is still pingable and all tcp ports which were
in listen state before a lockup are remains in listen state, but I can't get
any data from this ports. According to sar(1) systems had not been overloaded
right before a lockup. And there is no log entries in all user services logs
for almost 10 hours after lockup.
So I think this is an IO lockup. On the other side it doesn't look like a bug
in particular controller driver, because they are different for each box.
And finally it doesn't look like a bug in particular io-scheduler because two
of boxes were runed with "deadline" and one with "as". Of course all
assumptions are valid only if all lockups I had seen have the same nature.
All of three boxes are SMP. Unfortunately all are remote and aren't attached
to a serial console yet (this is planed in next couple of weeks).
1) ope
01:02.1 RAID bus controller: Mylex Corporation: Unknown device 0050 (rev 02)
elevator=deadline
.config: http://sysadminday.org.ru/2.6.1-io_lockup/ope/.config
lspci: http://sysadminday.org.ru/2.6.1-io_lockup/ope/lspci
lspci -vvn: http://sysadminday.org.ru/2.6.1-io_lockup/ope/lspci_-vvn
2) white
02:04.0 RAID bus controller: American Megatrends Inc. MegaRAID (rev 02)
elevator=deadline
.config: http://sysadminday.org.ru/2.6.1-io_lockup/white/.config
lspci: http://sysadminday.org.ru/2.6.1-io_lockup/white/lspci
lspci -vvn: http://sysadminday.org.ru/2.6.1-io_lockup/white/lspci_-vvn
3) tiny
02:00.0 Unknown mass storage controller: Compaq Computer Corporation Smart-2/P RAID Controller (rev 03)
03:00.0 Unknown mass storage controller: Compaq Computer Corporation Smart-2/P RAID Controller (rev 03)
elevator=as
.config: http://sysadminday.org.ru/2.6.1-io_lockup/tiny/.config
lspci: http://sysadminday.org.ru/2.6.1-io_lockup/tiny/lspci
lspci -vvn: http://sysadminday.org.ru/2.6.1-io_lockup/tiny/lspci_-vvn
Any hints will be appreciated.
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
- Previous message: Mans Matulewicz: "Re: ide-cdrom / atapi burning bug - 2.6.1"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
- Re: 2.6.12-rc2-mm3
... Otherwise the kernels seem to work fine -- no lockup unless ... installer
couldn't handle it at that time. ... Workload is normal, the lockups happen with just X
and Azaereus. ... send the line "unsubscribe linux-kernel" in ... (Linux-Kernel) - Re: nforce2 lockups
... > Windows 2000 flawlessly, but lockup in a minute under Linux. ...
> and disabled the onboard IDE and still had lockups. ... send the line "unsubscribe
linux-kernel" in ... (Linux-Kernel) - Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm1-V0.7.27-3
... > might have been coincidence. ... managed to reproduce the lockup
on my testbox, using your .config, ... Will turn on the NMI watchdog now, hopefully this lockup
will ... send the line "unsubscribe linux-kernel" in ... (Linux-Kernel) - Re: [patch 0/3] j_state_lock, j_list_lock, remove-bitlocks
... On Thu, 2005-03-17 at 11:23 -0500, Steven Rostedt wrote: ... Is it a performance
regression, or a latency issue, or a ... I think you can get this lockup whether
or not it ... send the line "unsubscribe linux-kernel" in ... (Linux-Kernel) - Re: Fixes for nforce2 hard lockup, apic, io-apic, udma133 covered
... I just had a lockup running with preempt, ... >> Survived my greptest
which no non patched kernel has ever done on this ... > To match dmesg output
try ... send the line "unsubscribe linux-kernel" in ... (Linux-Kernel)