2.4.26: IDE drives become unavailable randomly

From: Andre Costa (costa_at_tecgraf.puc-rio.br)
Date: 06/30/04

  • Next message: jsimmons_at_pentafluge.infradead.org: "Re: [PATCH] asiliantfb fixes"
    Date:	Wed, 30 Jun 2004 08:41:42 -0300
    To: linux-kernel@vger.kernel.org
    
    

    (please cc me on any replies, because I am not subscribed to this list;
    if I do need to subscribe, just let me know)

    Hi,

    I am using 2.4.26 SMP on a ABIT AT7 mobo, with a 2.8GHz P4 processor
    with hyper-threading enabled. I have one 80GB Seagate IDE disk
    as /dev/hda, and from time to time it seems to "disappear", usually
    after these messages appear a couple of trimes on/var/log/messages:

    Jun 27 17:15:00 dali kernel: hda: status timeout: status=0x80 { Busy }
    Jun 27 17:15:00 dali kernel:
    Jun 27 17:15:00 dali kernel: hda: drive not ready for command
    Jun 27 17:15:03 dali kernel: ide0: reset: success

    I already had some ide-related issues, namely the one mentioned here:

    http://www.x86-64.org/lists/discuss/msg04679.html

    Due to that, I am booting with:

    hdc=ide-scsi apm=off acpi=ht noapic

    Turning off APIC and keeping ACPI to a minimum seems to have fixed the
    "dma status == 0x24" problem, but I still experience the "status
    timeout" above, which is very frustrating because this is supposed to be
    a server for our intranet.

    I tried turning off APM for this disk with 'hdparm -B255 /dev/hda', but
    it didn't work:

    hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
    hda: drive_cmd: error=0x04 { DriveStatusError }

    I have turned off spindown with 'hdparm -S0 /dev/hda', but frankly I am
    not sure this will help (besides being bad for harddisk lifetime).

    So, given this scenario, I would really appreciate any suggestions on
    how to workaround this issue... Please, let me know if you need
    additional info. I am attaching below the output of 'hdparm -I /dev/hda'
    in case it helps. I am running Fedora Core 1.

    TIA

    Andre

    -------- output of 'hdparm -I /dev/hda' --------

    /dev/hda:

    ATA device, with non-removable media
            Model Number: ST380011A
            Serial Number: 3JV78385
            Firmware Revision: 3.06
    Standards:
            Used: ATA/ATAPI-6 T13 1410D revision 2
            Supported: 6 5 4 3
    Configuration:
            Logical max current
            cylinders 16383 65535
            heads 16 1
            sectors/track 63 63
            --
            CHS current addressable sectors: 4128705
            LBA user addressable sectors: 156301488
            LBA48 user addressable sectors: 156301488
            device size with M = 1024*1024: 76319 MBytes
            device size with M = 1000*1000: 80026 MBytes (80 GB)
    Capabilities:
            LBA, IORDY(can be disabled)
            bytes avail on r/w long: 4 Queue depth: 1
            Standby timer values: spec'd by Standard
            R/W multiple sector transfer: Max = 16 Current = 16
            Recommended acoustic management value: 128, current value: 0
            DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
                 Cycle time: min=120ns recommended=120ns
            PIO: pio0 pio1 pio2 pio3 pio4
                 Cycle time: no flow control=240ns IORDY flow control=120ns
    Commands/features:
            Enabled Supported:
               * READ BUFFER cmd
               * WRITE BUFFER cmd
               * Host Protected Area feature set
               * Look-ahead
               * Write cache
               * Power Management feature set
                    Security Mode feature set
               * SMART feature set
               * FLUSH CACHE EXT command
               * Mandatory FLUSH CACHE command
               * Device Configuration Overlay feature set
               * 48-bit Address feature set
                    SET MAX security extension
               * DOWNLOAD MICROCODE cmd
               * SMART self-test
               * SMART error logging
    Security:
                    supported
            not enabled
            not locked
            not frozen
            not expired: security count
            not supported: enhanced erase
    HW reset results:
            CBLID- above Vih
            Device num = 0 determined by the jumper
    Checksum: correct

    -- 
    Andre Oliveira da Costa
    (costa@tecgraf.puc-rio.br)
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at  http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at  http://www.tux.org/lkml/
    

  • Next message: jsimmons_at_pentafluge.infradead.org: "Re: [PATCH] asiliantfb fixes"