Re: FailSafe Event (md)?



Mike wrote:

[snip]

On this machine I execute:

-------------------
$ cat /proc/mdstat
Personalities : [raid5] [raid4] [raid1]
md0 : active raid1 sdf1[2](S) sde1[3](S) sdd1[4](S) sdc1[5](S) sdb1[1] sda1[0]
104320 blocks [2/2] [UU]

md1 : active raid1 sdf3[2](S) sde3[3](S) sdd3[4](S) sdc3[5](S) sdb3[1] sda3[0]
3068288 blocks [2/2] [UU]

md2 : active raid5 sdf2[4] sde2[5](F) sdd2[3] sdc2[2] sdb2[1] sda2[0]
560732160 blocks level 5, 256k chunk, algorithm 2 [5/5] [UUUUU]

unused devices: <none>
-------------------


Does the email message mean drive sde2[5] has failed? I know the sde2 refers
to the second partition of /dev/sde. Here is the partition table


Yes. That's why there's an (F) next to it.


I have partition 2 of drive sde as one of the raid devices for md2. Does the (S)
on sde3[3](S) mean the device is a spare for md1 and the same for md0?

Yes.
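
To pick the flagged members out of the listing without reading it by eye, something like the following might help. It is only a sketch: list_flagged is a made-up helper name, not part of mdadm, and it assumes the /proc/mdstat layout shown above, where each member appears as name[slot] followed by an optional (F) or (S).

```shell
# list_flagged [FILE]
# Hypothetical helper: prints "array device flag" for every member that
# carries an (F) (faulty) or (S) (spare) marker.
# Reads mdstat-formatted text from FILE, defaulting to /proc/mdstat.
list_flagged() {
    awk '
        # Array header lines look like: md2 : active raid5 sdf2[4] sde2[5](F) ...
        $2 == ":" && $1 ~ /^md/ {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\((F|S)\)$/) {
                    dev = $i
                    sub(/\[.*/, "", dev)        # sde2[5](F) -> sde2
                    flag = $i
                    sub(/^.*\(/, "", flag)      # sde2[5](F) -> F)
                    sub(/\)$/, "", flag)        # F)         -> F
                    print $1, dev, flag
                }
            }
        }
    ' "${1:-/proc/mdstat}"
}
```

Called with no argument it reads /proc/mdstat. Against the listing above it should report sde2 as F on md2, plus the (S) members of md0 and md1.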

Mike wrote:
On 2007-01-11, stan@xxxxxxxxxxxxxxxxxx <stan@xxxxxxxxxxxxxxxxxx> wrote:
Sounds like one of your hard drives died and was replaced
by a hot spare within the raid.

In which case you'll want to yank the bad disk and replace it
as you likely no longer have a hot spare.


Raid 5 can get along without a spare, but with degraded performance.
I've never gotten the email you got, but FailSpare does sound like a
failover to a spare, as opposed to failover to parity.

mdadm -QD /dev/md2

(query with details) may help interpret the data for you.


With AIX boxes I can blink a drive. Is there a way to blink the LEDs
of a drive in Linux? This is a Dell box running Fedora Core 5 with
recent patches.

If you mean hot swapping, it depends on the controller HW. Hot swapping
might be supported, but probably not. If you remove the drive, you must
first manually fail the other partitions on the drive, too.

mdadm /dev/md0 -f /dev/sde1
mdadm /dev/md1 -f /dev/sde3

Then remove the partitions.

mdadm /dev/md0 -r /dev/sde1
mdadm /dev/md1 -r /dev/sde3
mdadm /dev/md2 -r /dev/sde2
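
If you would rather review the commands before running anything, a small dry-run generator is one way to do it. This is a sketch, not a tested procedure: print_removal_cmds and its ARRAY:PARTITION argument format are invented for illustration, and it only prints the mdadm lines. Only the -f and -r options used above are assumed.

```shell
# print_removal_cmds DISK ARRAY:PARTNUM...
# Hypothetical helper: prints (does not run) the mdadm commands that would
# first fail, then remove, each listed partition of DISK.
print_removal_cmds() {
    disk=$1
    shift
    # First pass: mark every listed member as failed.
    for pair in "$@"; do
        array=${pair%%:*}    # text before the colon, e.g. md0
        part=${pair##*:}     # text after the colon, e.g. 1
        echo "mdadm /dev/$array -f /dev/$disk$part"
    done
    # Second pass: remove the failed members from their arrays.
    for pair in "$@"; do
        array=${pair%%:*}
        part=${pair##*:}
        echo "mdadm /dev/$array -r /dev/$disk$part"
    done
}
```

For example, print_removal_cmds sde md0:1 md1:3 prints the two -f lines followed by the two matching -r lines. A member already marked (F), such as sde2 in md2, only needs the -r step, as in the commands above.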

Then swap out the hardware and partition the new drive. I would also
recommend reconsidering your raid scheme. To have multiple arrays on
the same drives guarantees that one drive failure/replacement affects
multiple arrays, like now. Depending on how much downtime (rebooting
and rebooting and rebooting) you can endure, you might be able to make
some changes.

Once the drive is replaced, will md automatically detect the replacement
and rebuild as necessary?

After partitioning, you would add the new partitions back into the
array(s) with

mdadm <array device> -a <partition device>

mdadm handles the rest.
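
While the rebuild runs, /proc/mdstat normally shows a progress line under the affected array. A rough way to pull out just the percentage is sketched below. rebuild_progress is a hypothetical helper name, and it assumes the progress line contains "recovery = NN.N%" (or "resync = NN.N%"), so adjust the pattern if your kernel prints something different.

```shell
# rebuild_progress [FILE]
# Hypothetical helper: prints "array percent" for each array that currently
# shows a recovery or resync progress line.
# Reads mdstat-formatted text from FILE, defaulting to /proc/mdstat.
rebuild_progress() {
    awk '
        # Remember which array the following lines belong to.
        $2 == ":" && $1 ~ /^md/ { array = $1 }
        # Progress lines contain "recovery = 12.6%" or "resync = 12.6%".
        /(recovery|resync) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print array, $i
        }
    ' "${1:-/proc/mdstat}"
}
```

No output means no array is showing a progress line at that moment.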