2.6.1 IO lockup on SMP systems

From: Sergey S. Kostyliov (rathamahata_at_php4.ru)
Date: 01/31/04

  • Next message: Greg Norris: "Re: net-pf-10, 2.6.1"
    To: linux-kernel@vger.kernel.org
    Date:	Sat, 31 Jan 2004 19:40:27 +0300
    
    

    Hello all,

    I had experienced a lockups on three of my servers with 2.6.1. It doesn't
    look like a deadlock, the box is still pingable and all tcp ports which were
     in listen state before a lockup are remains in listen state, but I can't get
    any data from this ports. According to sar(1) systems had not been overloaded
    right before a lockup. And there is no log entries in all user services logs
    for almost 10 hours after lockup.

    So I think this is an IO lockup. On the other side it doesn't look like a bug
     in particular controller driver, because they are different for each box.
    And finally it doesn't look like a bug in particular io-scheduler because two
    of boxes were runed with "deadline" and one with "as". Of course all
    assumptions are valid only if all lockups I had seen have the same nature.

    All of three boxes are SMP. Unfortunately all are remote and aren't attached
    to a serial console yet (this is planed in next couple of weeks).

    1) ope
    01:02.1 RAID bus controller: Mylex Corporation: Unknown device 0050 (rev 02)
    elevator=deadline
    .config: http://sysadminday.org.ru/2.6.1-io_lockup/ope/.config
    lspci: http://sysadminday.org.ru/2.6.1-io_lockup/ope/lspci
    lspci -vvn: http://sysadminday.org.ru/2.6.1-io_lockup/ope/lspci_-vvn

    2) white
    02:04.0 RAID bus controller: American Megatrends Inc. MegaRAID (rev 02)
    elevator=deadline
    .config: http://sysadminday.org.ru/2.6.1-io_lockup/white/.config
    lspci: http://sysadminday.org.ru/2.6.1-io_lockup/white/lspci
    lspci -vvn: http://sysadminday.org.ru/2.6.1-io_lockup/white/lspci_-vvn

    3) tiny
    02:00.0 Unknown mass storage controller: Compaq Computer Corporation Smart-2/P RAID Controller (rev 03)
    03:00.0 Unknown mass storage controller: Compaq Computer Corporation Smart-2/P RAID Controller (rev 03)
    elevator=as
    .config: http://sysadminday.org.ru/2.6.1-io_lockup/tiny/.config
    lspci: http://sysadminday.org.ru/2.6.1-io_lockup/tiny/lspci
    lspci -vvn: http://sysadminday.org.ru/2.6.1-io_lockup/tiny/lspci_-vvn

    Any hints will be appreciated.

    -- 
                       Best regards,
                       Sergey S. Kostyliov <rathamahata@php4.ru>
                       Public PGP key: http://sysadminday.org.ru/rathamahata.asc
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at  http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at  http://www.tux.org/lkml/
    

  • Next message: Greg Norris: "Re: net-pf-10, 2.6.1"

    Relevant Pages