Re: harddisk or kernel problem?

From: Bruce Allen (ballen_at_gravity.phys.uwm.edu)
Date: 02/18/04

    Date:	Wed, 18 Feb 2004 10:55:05 -0600 (CST)
    To: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>
    
    

    > On Monday 16 of February 2004 00:34, Nico Schottelius wrote:
    > > Bartlomiej Zolnierkiewicz [Fri, Feb 13, 2004 at 05:17:34PM +0100]:
    > > > [ ... ]
    > > > Check your disk with SMART tools: http://smartmontools.sf.net.
    > >
    > > Before I continue to report what I've found out:
    > >
    > > Thank you all for your good help!
    > >
    > > I'm really down as this is the second disk
    > > dying within two months (and the second 2.5" hd even, I begin to think
    > > notebooks don't like me :/).
    >
    > :-( sh*t happens
    >
    > > I currently collect all data I get / find out to
    > >
    > > http://schotteli.us/~nico/hd-problem.02/
    > > (renamed it, as this does not look like a kernel problem and I don't
    > > want somebody believe this).
    > >
    > > Currently I don't really understand the output of smartctl and cannot
    > > say what causes the error. Perhaps someone can give me a hint on this?
    >
    > The most important things are: READ DMA errors were logged,
    > SMART self tests (short and extended) completed with read failure.

    FWIW, after reading this thread, I've slightly modified smartmontools so
    that when smartctl prints the error log (-l error) it ALSO prints the LBA
    at which a READ or WRITE command failed.

    [Note that this is a 28-bit sector address. With 512-byte sectors, 28
    bits cover 2^28 * 512 = 2^37 Bytes = 137 GB. If a disk is larger than
    that, then some LBAs can't be written in 28 bits, in which case there
    won't be a summary error log entry. If the disk is smaller than 2^37
    Bytes then the failing LBA should always be logged.]
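
    As a rough check of the arithmetic in the note above, here is a small
    Python sketch. The helper names are made up for illustration and the
    512-byte sector size is an assumption; none of this is smartmontools code.

```python
# Sketch only: hypothetical helpers, assuming 512-byte sectors.
SECTOR_BYTES = 512   # assumed sector size
LBA28_BITS = 28      # width of the sector address in the summary error log


def lba28_capacity_bytes(sector_bytes=SECTOR_BYTES):
    """Largest disk size (in bytes) fully addressable with a 28-bit LBA."""
    return (1 << LBA28_BITS) * sector_bytes


def fits_in_lba28(lba):
    """True if this sector number can be written in 28 bits."""
    return 0 <= lba < (1 << LBA28_BITS)


print(lba28_capacity_bytes())   # 137438953472, i.e. 2^37 bytes (about 137 GB)
print(fits_in_lba28(8305458))   # True: the LBA from Nico's log fits easily
```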

    This feature is in smartmontools releases AFTER 5.27, and is available
    now from the smartmontools CVS server.

    In the case of Nico's disk the failed READ LBA address can be computed
    from the error log registers at
    http://schotteli.us/~nico/hd-problem.02/smartctl-a:

      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      40 51 01 32 bb 7e e0 Error: UNC

    yielding:

      LBA = 0x007ebb32 (bits 0-3 of DH, CH, CL and SN)

    which can be compared with the LBA of the first error in the self-test
    log:

    # 1 Extended offline Completed: read failure 90% 1633 0x007ebb32

    and with the LBA logged in SYSLOG:
    hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=8305458, sector=8305454

    Since 8305458 = 0x7EBB32, it's a pretty open-and-shut case.
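
    For anyone who wants to repeat the register arithmetic by hand, a
    minimal Python sketch follows. The function name is hypothetical and
    this is not the actual smartmontools code; it only assembles the 28-bit
    LBA from bits 0-3 of DH, then CH, CL and SN, as described above.

```python
# Sketch only: a hypothetical helper, not taken from smartmontools.
def lba28_from_registers(sn, cl, ch, dh):
    """Assemble a 28-bit LBA from the logged ATA registers.

    Bits 0-3 of DH are the top nibble, followed by CH, CL and SN.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register values from the error log entry: SN=32 CL=bb CH=7e DH=e0
lba = lba28_from_registers(sn=0x32, cl=0xBB, ch=0x7E, dh=0xE0)
print(f"LBA = 0x{lba:08x} = {lba}")   # LBA = 0x007ebb32 = 8305458

# Compare with the LBAsect value the kernel wrote to SYSLOG.
syslog_lbasect = 8305458
print(lba == syslog_lbasect)          # True
```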

    ****************************************************************

    Executive summary: smartmontools versions > 5.27 will print this:

      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      40 51 01 32 bb 7e e0 Error: UNC at LBA = 0x007ebb32 = 8305458

    rather than this:

      ER ST SC SN CL CH DH
      -- -- -- -- -- -- --
      40 51 01 32 bb 7e e0 Error: UNC

    Cheers,
            Bruce

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

