Re: FYI: RAID5 unusably unstable through 2.6.14



On Fri, 3 Feb 2006, Martin Drab wrote:

On Thu, 2 Feb 2006, Bill Davidsen wrote:

Just to state clearly in the first place. I've allready solved the problem
by low-level formatting the entire disk that this inconsistent array in
question was part of.

So now everything is back to normal. So unforunatelly I would not be able
to do any more tests on the device in the non-working state.

I mentioned this problem here now just to let you konw that there is such
a problematic Linux behviour (and IMO flawed) in such circumstances, and
that perhaps it may let you think of such situations when doing further
improvements and development in the design of the block device layer (or
wherever the problem may possibly come from).

Perhaps maybe rather of the SCSI layer, than of the block layer ??

And I also hope you would understand, that I wouldn't try to create that
state again deliberatelly, since my main system is running on that array
and I wouldn't risk loosing some more data because of this.

However maybe someone perhaps in Adaptec or smewhere else may have
some simillar system at the disposal on which he could allow to experiment
on demand without any serious risk of loosing anything important.

So what I may say is that it is an Adaptec 2410SA with 8205 firmware and
without a battery backup system (which is probably the crutial thing).
And the inconsistency was caused by a MB protection of CPU overheat
shutdown, because I've started the system and booted Linux from the array
in question (which consisted by just one part of one disk), while I've
forgotten to turn on the water cooling of the CPU and northbridge. So
after about 3 minutes the system automatically shut down and Linux was
probably doing some writing in that very moment, which wasn't able to
complete fully (most probably due to the lack of the battery backup system
on the RAID controller). So my guess is that this may be artificially
reproduced when you suddenly switch off a power source of the system while
Linux is doing some writing to the array.

My arrays in particular are:

HDD1 (160 GB): 120 GB Array 1, 40 GB Array 2
HDD2 (120 GB): 120 GB Array 1
HDD3 (120 GB): 120 GB Array 1
HDD4 (120 GB): 120 GB Array 1

Where Array 1 is a RAID 5 array /dev/sdb (labeled as "Data 1"), which
contains just one 330 GB partition /dev/sdb1, and Array 2 is a bootable
(in Adaptec BIOS setup so called) Volume array (i.e. no RAID) /dev/sda
(labeled as "SYSTEM"), which contains /dev/sda1 (NTFS Windows), /dev/sda2
(ext3 Linux), /dev/sda3 (Linux swap). Problem was accessing the whole Array 2.
Array 1 from Linux worked well.

Then, when I tried, the array checking function within the BIOS of the
Adaptec controller found an inconsistency on the position somewhere in the
middle of the /dev/sda, so somewhere within the /dev/sda2 in particular.
So I low-level formatted the entire HDD1, resynced the Array 1 (which is
RAID 5, so no problem) and reinstalled both systems in Array 2, and now it
is all back to normal again.

Martin Drab wrote:

Well, I had a similar experience lately with the Adaptec AAC-2410SA RAID 5
array. Due to the CPU overheating the whole box was suddenly shot down by
the CPU damage protection mechanism. While there is no battery backup on
this particular RAID controller, the sudden poweroff caused some very
localized inconsistency of one disk in the RAID. The configuration was 1x160
GB and 3x120GB, with the 160 GB being split into 120 GB part within the RAID
5 and a 40 GB part as a separate volume. The inconsistency happend in the 40
GB part of the 160 GB HDD (as reported by the Adaptec BIOS media check). In
particular the problem was in the /dev/sda2 (with /dev/sda being the 40 GB
Volume, /dev/sda1 being an NTFS Windows system, and /dev/sda2 being ext3
Linux system).

Now, what is interesting, is that Linux completely refused any possible
access to every byte within /dev/sda, not even dd(1) reading from any
position within /dev/sda, not even "fdisk /dev/sda", nothing. Everything
^^^^^^^^^^^^^^
ended up with lots of following messages:

sd 0:0:0:0: SCSI error: return code = 0x8000002
sda: Current: sense key: Hardware Error
Additional sense: Internal target failure
Info fld=0x0
end_request: I/O error, dev sda, sector <some sector number>

But /dev/sda is not a Linux filesystem, running fsck on it makes no sense. You
wanted to run on /dev/sda2.

But I was talking about fdisk(1). This wasn't a problematic behaviour of a
filesystem, but of the entire block device.

I've consulted this with Mark Salyzyn, because I thought it was a problem of
the AACRAID driver. But I was told, that there is nothing that AACRAID can
possibly do about it, and that it is a problem of the upper Linux layers
(block device layer?) that are strictly fault intollerant, and thouth the
problem was just an inconsistency of one particular localized region inside
/dev/sda2, Linux was COMPLETELY UNABLE (!!!!!) to read a single byte from
the ENTIRE VOLUME (/dev/sda)!

The obvious test of this "it's not us" statement is to connect that one drive
to another type controller and see if the upper level code recovers. I'm
assuming that "sda" is a real drive and not some pseudo-drive which exists
only in the firmware of the RAID controller.

/dev/sda is a 40 GB RAID array consisting of just one 40 GB part of one
160 GB drive. But it is in fact a virtual device supplied by the
controller. I.e. this 40 GB part of that disc behaves as an entire
harddisk (with it's own MBR etc.). And it is at the end of the drive, so
it may be a little tricky to find the exact position of the partitions
there, but it may be possible.

That message is curious, did you
cat /proc/scsi/scsi to see what the system thought was there? Use the infamous
"cdrecord -scanbus" command?

----------
$ cdrecord -scanbus
Cdrecord-Clone 2.01.01a03-dvd (i686-pc-linux-gnu) Copyright (C) 1995-2005 J�g Schilling
Note: This version is an unofficial (modified) version with DVD support
Note: and therefore may have bugs that are not present in the original.
Note: Please send bug reports or support requests to warly at mandriva.com.
Note: The author of cdrecord should not be bothered with problems in this
version.
Linux sg driver version: 3.5.33
Using libscg version 'schily-0.8'.
scsibus0:
0,0,0 0) 'Adaptec ' 'SYSTEM ' 'V1.0' Disk
0,1,0 1) 'Adaptec ' 'Data 1 ' 'V1.0' Disk
0,2,0 2) *
0,3,0 3) *
0,4,0 4) *
0,5,0 5) *
0,6,0 6) *
0,7,0 7) *

$ cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: Adaptec Model: SYSTEM Rev: V1.0
Type: Direct-Access ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 01 Lun: 00
Vendor: Adaptec Model: Data 1 Rev: V1.0
Type: Direct-Access ANSI SCSI revision: 02
-----------

The 0,0,0 is the /dev/sda. And even though this is now, after low-level
formatting of the previously inconsistent disc, the indications back then
were just the same. Which means every indication behaved as usual. Both
arrays were properly identified. But when I was accessing the inconsistent
one, i.e. /dev/sda, in any way (even just bytes, this has nothing to do
with any filesystems) the error messages mentioned above appeared. I'm not
sure what exactly was generating them, but I've CC'd Mark Salyzyn, maybe
he can explain more to it.

And now for the best part: From Windows, I was able to access the ENTIRE
VOLUME without the slightest problem. Not only did Windows boot entirely
from the /dev/sda1, but using Total Commander's ext3 plugin I was also able
to access the ENTIRE /dev/sda2 and at least extract the most important data
and configurations, before I did the complete low-level formatting of the
drive, which fixed the inconsistency problem.

I call this "AN IRONY" to be forced to use Windows to extract information
from Linux partition, wouldn't you? ;)

(Besides, even GRUB (using BIOS) accessed the /dev/sda without complications
- as it was the bootable volume. Only Linux failed here a 100%. :()

From the way you say sda when you presumably mean sda1 or sda2 it's not clear
if you don't understand the difference between drive and partition access or
are just so pissed off you are not taking the time to state the distinction
clearly.

No, I understand the differences very clearly. But maybe I was just
unclear in my expressions (for which I appologize). What I mean is that
the problem was with the entire RAID array /dev/sda. So whenever ANY
access to ANY part of /dev/sda, which of course also includes accesses to
all of /dev/sda1, /dev/sda2, and /dev/sda3, the error messages appeared
and no access was performed. That includes even accesses like this
"dd if=/dev/sda of=/dev/null bs=512 count=1" and any other possible
accesses. So the problem was with the entire device /dev/sda.

There was a problem with recovery from errors in RAID-5 which is addressed by
recent changes to fail a sector, try rewriting it, etc.

Maybe this was again my bad explanation, but this wasn't a problem of a
RAID 5 array, and much less of a software array. Adaptec 2410SA is a
4-channel HW SATA-I RAID controller.

I would have to read linux-raid archives to explain it, so I'll stop
with the overview. I don't think that's the issue here, you're using a
RAID controller rather than the software RAID, so it should not apply.

Yes, exactly. And again, I've solved it by lowlevel formatting.

I assume that the problem is gone, so we can't do any more analysis after the
fact.

Unfortunatelly, yes. But I've described above how did it happen, so maybe
someone in Adaptec would be able to reproduce, Mark?

Martin

====================================================
Martin Drab
Department of Solid State Engineering
Department of Mathematics
Faculty of Nuclear Sciences and Physical Engineering
Czech Technical University in Prague
Trojanova 13
120 00 Praha 2, Czech Republic
Tel: +420 22435 8649
Fax: +420 22435 8601
E-mail: drab@xxxxxxxxxxxxxxxxxxx
====================================================

Relevant Pages

  • Re: FYI: RAID5 unusably unstable through 2.6.14
    ... since my main system is running on that array ... because I've started the system and booted Linux from the array ... Adaptec controller found an inconsistency on the position somewhere in the ... RAID 5, so no problem) and reinstalled both systems in Array 2, and now it ...
    (Linux-Kernel)
  • Re: Deserate Call for help - RAID 5 on RHL9
    ... You sound like you're trying to do a Linux software RAID ... > partition etc) but the raid 5 array is reported as having a superblock ... remove the faulty disk and replace it ...
    (alt.os.linux.redhat)
  • Re: [SLE] Help with disk integrity and RAID-1 please
    ... I just zero its contents and add it to the array again? ... I understand the basic idea of RAID, ... > There is no such thing as a 'partial drive failure' on an IDE drive. ... O'Reily has 'Managing RAID on linux' ...
    (SuSE)
  • Re: RAID question
    ... Besides telling the controller to use them as a RAID array, ... > Is there any software for linux to detect disk failures on the RAID ...
    (Debian-User)
  • Software RAID-1 with Mandrake9.0 problems - LONG
    ... I have configured RAID -1 ... md: md0: raid array is not clean -- starting background reconstruction ... ...trying to set up timer as Virtual Wire IRQ... ... host bus clock speed is 132.8612 MHz. ...
    (comp.os.linux.setup)