Determining where the fault lies when hardware fails?
- From: "Calab" <myspam@xxxxxx>
- Date: Fri, 08 Feb 2008 04:53:56 GMT
I'm running Debian Lenny here and had a failure on my RAID5 array (see my thread
"hdd: Drive not ready for command"). I've got the machine back up and
operating normally. I suspect it may be a software glitch, but I'm not going
to give up that easily.
I've looked through the various files found in /var/log and don't see any
reference to hardware issues.
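A search of that kind can be sketched roughly as below. This is only an illustration: the function name `hw_log_scan` is made up, and the pattern list is an assumption based on the messages quoted in this post, not a complete set of hardware error strings.

```shell
# Search the given log files for lines that often accompany disk or
# controller trouble. The patterns are illustrative assumptions only.
hw_log_scan() {
    # -H: print the file name, -i: ignore case, -E: extended regex.
    # Errors about missing or unreadable files are discarded.
    grep -HiE 'disabling irq|drive not ready|i/o error|non-fresh|ata[0-9]+.*(error|exception)' "$@" 2>/dev/null
}

# Example (on a default Debian install, kernel messages typically end up
# in these files; adjust the list to match your syslog configuration):
# hw_log_scan /var/log/kern.log /var/log/syslog /var/log/messages
```

If the failure happened a while ago, the relevant lines may already be in rotated copies of these files, so those are worth including in the file list as well.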
When the machine failed I was left with the following:
Message from syslog@debian at Wed Feb 6 06:22:3 2008 ...
debian kernel: Disabling IRQ #20
hdd: Drive not ready for command
hdd: Drive not ready for command
hdd: Drive not ready for command
hdd: Drive not rea... etc. ad infinitum
At this point I rebooted. Two of my drives came up "non-fresh" and were
kicked from the array:
kicking non-fresh sdd3 from array
unbind <sdd3>
export_rdev(sdd3)
kicking non-fresh sdc3 from array
unbind <sdc3>
export_rdev(sdc3)
...
raid5: device hda3 operational as raid disk0
raid5: device sdb3 operational as raid disk2
raid5: device sda3 operational as raid disk1
raid5: not enough operational devices for md0 (2/5 failed)
... which I managed to work around by re-adding them to the array and
rebooting:
mdadm --add /dev/md0 /dev/sdc3 /dev/sdd3 -R
mdadm -w /dev/md0
reboot
... and all is well. The data is fine and the machine is working normally.
Now I want to know what failed.
Looking at the above, it seems like both sdc and sdd had issues. IIRC, both
of these drives are connected to the Promise SATA controller of my Asus
P4C800E-Dlx mainboard.
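The inference above rests on which members the kernel kicked at boot. A minimal sketch for pulling those device names out of kernel log text follows; the function name `kicked_members` is hypothetical, and it assumes the messages use the same "kicking non-fresh ... from array" wording quoted earlier.

```shell
# List the array members the kernel kicked as "non-fresh".
# Reads kernel log text on stdin and prints each device name once, sorted.
kicked_members() {
    sed -n 's/.*kicking non-fresh \([^ ]*\) from array.*/\1/p' | sort -u
}

# Example (assumes the boot messages are still in the kernel ring buffer):
# dmesg | kicked_members
```

As a cross-check, comparing the Events value that `mdadm --examine` reports for each member should show which ones had fallen behind the rest of the array.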
As I said before, I don't see any failure messages anywhere among the
various log files at /var/log...
Can someone shed a bit of light on this issue? Where should I expect to see
messages saved if the hardware was misbehaving?
Thanks!