Re: Raid: software or hardware



Anton Ertl wrote:
Matthew Wild <M.Wild@xxxxxxxx> writes:
Personally, I've generally preferred hardware RAID. I've had software
RAID systems not notice a drive failing,

Which software RAID was that?

Standard Linux md RAID.

causing corruption of the
filesystem.

How does not noticing a failing drive cause the corruption of the file
system?

Because while a drive is failing, it corrupts any accesses made to it.

The better hardware RAID cards seem to be better at sniffing
out a failing disk and removing it from an array before any problems can
ensue.

What kind of sniffing and problems are you imagining? If a disk
fails, that's a very obvious thing (it either delivers errors or does
not talk to the controller at all), and a RAID (except RAID0) is there
to prevent problems from failing disks. Even SMART data is not very
good at predicting disk failures (cf. the Google study), and anyway,
with a RAID you can afford to wait until it really fails.

I was just pointing out that while a disk was failing, the software RAID
did not notice and limped along, still trying to use the failing disk,
which happily provided garbage when accessed. In my experience, the
hardware RAID systems I have used, 3Ware, Digital/Compaq/HP RA8000,
MA8000, MSA1500 (quite horrible to administer), manage their arrays
fairly conservatively and drop disks pretty quickly.
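
For what it's worth, one way to look for the "disk happily provides garbage" case on md is the scrub counter in sysfs. Writing "check" to /sys/block/mdX/md/sync_action starts a read-only consistency pass, and /sys/block/mdX/md/mismatch_cnt then reports how many mismatches were seen. Below is a minimal Python sketch that reads those counters. The function names and the threshold parameter are my own, purely illustrative, and a non-zero count on RAID1 is not always a real fault, so treat a hit as a prompt to investigate rather than proof of a bad disk.

```python
from pathlib import Path


def read_mismatch_counts(sys_block=Path("/sys/block")):
    """Return {array_name: mismatch_cnt} for every md array found.

    sys_block is a parameter only so the function can be pointed at a
    test directory; on a real system the default is what you want.
    """
    counts = {}
    for md_dir in sorted(Path(sys_block).glob("md*/md")):
        counter = md_dir / "mismatch_cnt"
        try:
            counts[md_dir.parent.name] = int(counter.read_text().strip())
        except (OSError, ValueError):
            # Counter missing or unreadable: skip this array.
            continue
    return counts


def suspicious(counts, threshold=0):
    """Names of arrays whose mismatch count exceeds threshold (hypothetical policy)."""
    return sorted(name for name, n in counts.items() if n > threshold)


if __name__ == "__main__":
    for name in suspicious(read_mismatch_counts()):
        print(f"{name}: mismatches reported by the last check")
```

The counter only reflects the most recent check or repair pass, so this is only useful if a check is actually run periodically.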

If a drive fails, software and hardware RAID are equivalent. Simply
replace the drive. The md software in Linux will automatically rebuild
the drive as needed. The "hardware" solution? Who knows, but we presume
it will do the same.

All hardware RAID systems I've used do this and more. They'll even
e-mail you with any alerts on the system.

Is that something special? Look up "mdadm --monitor --mail".

No it's not special. I already use the mdadm monitor. I've just had
occasions where, certainly on mirrored drives, md has not seen the
failing disk at all.

The real difference is what happens if you suffer a controller failure.
Typically, if using hardware RAID (say RAID 5), you will have to replace
the mainboard/controller with an EXACT match. Which puts a lot of trust
in your vendor. With Software RAID, you put the drives into a compatible
Linux box, and things "just work".

I wouldn't have thought this is such a major problem if you use a major
RAID system vendor. 3Ware or Areca are likely to be around for a while.

Maybe. But will they still have that RAID system on offer? And how
long does it take you to get it? If you don't have the replacement
on-site that means quite a bit of downtime.

If you're considering those timescales, you shouldn't really be relying
on any component part of your system to last that long and need a plan
in place for repopulation/replacement of the system.

BTW, the OP was not asking about a 3Ware RAID controller, but about a
JMicron on-motherboard controller. Who knows how long JMicron will be
around, and whether it will be possible to get a compatible RAID
controller or motherboard when the first one dies in a few years.

I know, I only put my oar in because the discussion was drifting on to
the larger question.

For really important data systems, you should be thinking of having
spares ready to be installed, or go for a proper external RAID system
with hot-swap everything and support contracts to match.

Yes. For every server we buy, we buy a second one as a spare. We do
not spend a lot of money for redundancy within each server, except for
an md RAID1, which is cheap.

It really depends how critical your service is ;-). We tend to try to
stick to a few basic systems and therefore only need a spare of each
type or the components thereof. Just recently I had a CPU fan on one of
our 20TB storage servers fly off the heatsink and take out a few
capacitors on the motherboard! The server carried on fine until it was
shut down to swap in the spare motherboard and CPUs.

However, I doubt that the OP is in the market for that.

I agree.