Re: How does Linux inform me when a RAID disk has failed?

From: Michael Heiming (michael+USENET_at_www.heiming.de)
Date: 12/05/04


Date: Sun, 5 Dec 2004 09:06:48 +0100

In comp.os.linux.misc Keith Keller <kkeller-usenet@wombat.san-francisco.ca.us>:
> On 2004-12-04, Michael Heiming <michael+USENET@www.heiming.de> wrote:
>> Many on-board raid controllers are just some kind of fake raid and
>> are pretty slow, if supported at all. I'd go with softraid or get
>> a real raid controller, 3ware (SATA) or Adaptec (SCSI), both have
>> tools to check, making it pretty easy to write some script
>> running from cron.

> FWIW, the 3ware linux daemon can be configured to send mail to a
> specified address. It sends mail on various events, including when a
> disk is having problems it can detect, and when it deactivates a disk
> due to failure.

Yep, thanks for mentioning it, I had almost forgotten about "smartd".
Soon after trying it out again, the box I'm currently working on
was completely hosed, but perhaps I wanted to reboot anyway?
Now I remember why I gave up on it: smartd never really worked for
me and almost always brought the complete system to a grinding halt. ;(

Anyway, there's only this one box with a 3ware controller, so I don't
really need this tool and will happily forgo it.

A cron job parsing the output of:

 tw_cli info c0 drivestatus

works much more reliably while doing the same thing: sending a mail if
there's a drive failure.
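
For anyone wanting a starting point, here is a rough sketch of such a cron
job in Python. It is not my actual script. It assumes the drive rows of the
tw_cli output start with a port name such as "p0" followed by a status word,
and that "OK" is the only healthy status; check that against what your
tw_cli version really prints. The names failed_drives and main are made up
for this example. It relies on cron mailing any output of a job to the
owner (or to MAILTO), so it only prints when something looks wrong.

```python
import re
import subprocess

# ASSUMPTION: drive rows look like "p0   OK   u0 ...", i.e. a port name
# followed by a status word. Adjust the pattern to your tw_cli output.
PORT_ROW = re.compile(r"^\s*(p\d+)\s+(\S+)")


def failed_drives(output):
    """Return (port, status) pairs for every drive row whose status is not OK."""
    bad = []
    for line in output.splitlines():
        match = PORT_ROW.match(line)
        if match and match.group(2).upper() != "OK":
            bad.append((match.group(1), match.group(2)))
    return bad


def main():
    """Run tw_cli and print a report only if something looks wrong."""
    try:
        result = subprocess.run(
            ["tw_cli", "info", "c0", "drivestatus"],
            capture_output=True,
            text=True,
        )
    except OSError as err:
        # tw_cli missing or not executable: worth a mail as well.
        print("could not run tw_cli:", err)
        return
    bad = failed_drives(result.stdout)
    if result.returncode != 0 or bad:
        print("3ware controller c0 needs attention")
        for port, status in bad:
            print("  %s: %s" % (port, status))
        # Include the raw output so the mail shows the full picture.
        print(result.stdout)


if __name__ == "__main__":
    main()
```

Saved under a path of your choice and called from a crontab entry, this
stays silent as long as every drive reports OK, so a mail only arrives when
a drive has a problem or tw_cli itself fails.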

-- 
Michael Heiming (X-PGP-Sig > GPG-Key ID: EDD27B94)
mail: echo zvpunry@urvzvat.qr | perl -pe 'y/a-z/n-za-m/'
#bofh excuse 431: Borg implants are failing

