Re: bad blocks on raid5 cause filesystem failure

From: Michael (mhyman_at_yahoo.com)
Date: 09/30/05

  • Next message: Michael: "Re: System freezing"
    Date: Thu, 29 Sep 2005 23:21:48 -0700
    
    

    In article <1127329313.446678.325740@g47g2000cwa.googlegroups.com>,
    alazarev@itg.uiuc.edu says...
    > Thanks for the informative post. I've got a few questions though.
    >
    > 1) Do you have a link to the report that you read which describes the
    > probability of a double fault? Sounds like an interesting read for me.
    >
    > 2) Correct me if I'm wrong, but if two blocks on a drive, happen to
    > fail at the same time, before rebuild can finish parity on the first,
    > then you will have a problem, unless you have double parity? Fine, but
    > then what about 3 bad blocks in a row. At some point, the RAID
    > controller should, like you say, stop all host IO and report the drive
    > failed, and then rebuild the drive from parity. How many bad blocks in
    > a row should cause this drive failure, three or more, right? Since we
    > saw about 10 bad block failures all with the same time stamp, double
    > parity would not have helped us at all. The only thing that would have
    > helped us is a RAID controller that would stop IO to the host. Instead,
    > our RAID still provided "fake access" for the host and thus the fs
    > failure. Sound legit to you? Any idea what functionality this is
    > called, so I know to avoid it when shopping around for new RAID? I
    > suppose SCSI provides much better reliability in this respect. Too bad,
    > we are already in the SATA hole. Too much data to afford moving it to
    > SCSI.
    >
    > 3) Double parity is also called RAID 6, right? Does RAID 6 provide
    > double parity at the block level? Or only at the drive level?
    >
    > Thanks,
    >
    > Alex
    >
    >

    Having even a single bad block, when that block happens to be in the
    super-block area or the VTOC, will cause exactly the FS error you are
    seeing:

    Sep 7 01:29:52 zeus kernel: attempt to access beyond end of device
    Sep 7 01:29:52 zeus kernel: sdb1: rw=1, want=8072683984, limit=2927171457

    The FS can't determine the bounds of the partition because the partition
    map is damaged. FSCK will keep sequentially scanning the disk until it
    hits the end of the disk, if I remember correctly.
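
    As a rough illustration of how to read the second log line, here is a
    small Python sketch that pulls out the want= and limit= values and
    shows how far past the end of the device the request went. The helper
    names are made up for this example, and treating the values as 512-byte
    sectors is an assumption about the kernel's units, not something the
    log itself states.

```python
import re

# Second kernel line from the log above, with the wrapped tail rejoined.
LOG_LINE = (
    "Sep 7 01:29:52 zeus kernel: sdb1: rw=1, "
    "want=8072683984, limit=2927171457"
)

# Assumption: want/limit are counted in 512-byte sectors.
SECTOR_BYTES = 512

# Matches the "<dev>: rw=N, want=N, limit=N" part of the message.
PATTERN = re.compile(
    r"(?P<dev>\w+): rw=(?P<rw>\d+), want=(?P<want>\d+), limit=(?P<limit>\d+)"
)


def parse_beyond_end(line):
    """Return (device, want, limit) from a want=/limit= line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return match.group("dev"), int(match.group("want")), int(match.group("limit"))


def overshoot_sectors(want, limit):
    """Number of sectors by which the request exceeds the device limit."""
    return max(0, want - limit)


if __name__ == "__main__":
    dev, want, limit = parse_beyond_end(LOG_LINE)
    over = overshoot_sectors(want, limit)
    print(f"{dev}: request ends {over} sectors past the device limit")
    print(f"assumed device size: {limit * SECTOR_BYTES} bytes")
    print(f"requested end offset: {want * SECTOR_BYTES} bytes")
```

    A request that ends at more than twice the device size is not a
    near miss at the partition boundary; it fits the picture of the FS
    working from damaged metadata rather than a slightly wrong size.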

    You are 100% right though about the RAID array not working properly. I
    hate to say it, but this is an area where Linux has some maturing to do.
    The kernel, the FS and the RAID array are not tightly coupled, and
    therefore allow these things to slip through the cracks. From what you
    were describing, the RAID array saw the bad block but did not reallocate
    a good block and add the bad one to its bad-block list. The FS then
    never knew there was a problem until it couldn't read critical areas of
    the disk, and at that point you are screwed.

    What could you have done at the point when you discovered the problems?
    Maybe copied the data to another FS, if you even had that much space.
    You mention a sequence of events leading up to the big failure, but a
    single failed block in the wrong place could have killed it on its own;
    it just took a bit longer in your case.

    I hope you do research arrays and then post your results, because it is
    a big issue, at least as I see it.

    Regards...Michael

