any guess why my software array didn't rebuild?

From: Robert Nagle (idiotprogrammer_at_yahoo.com)
Date: 01/19/04


Date: 18 Jan 2004 23:36:32 -0800

Hi, I had three 60 GB drives on a webserver, with three software
RAID arrays. Webmin (and other commands) showed a failure on hda, so I
swapped out hda for a new hard drive of the exact same size and model.
But the machine wouldn't boot afterwards. I'm resigned to installing
again from scratch (I made sure to do backups, etc.), but I'm puzzled
by this. I checked the cables and verified that I was replacing the
right hard drive (and btw, I should add that replacing drives is a real
pain!).

(I posted a question about this problem here:
http://groups.google.com/groups?hl=en&lr=lang_en|lang_de&ie=UTF-8&oe=utf-8&safe=off&threadm=slrnbtnb61.ujd.The-Central-Scrutinizer%40turing.kaosol.net&rnum=5&prev=/groups%3Fq%3Didiotprogrammer%2540yahoo.com%26num%3D100%26hl%3Den%26lr%3Dlang_en%257Clang_de%26ie%3DUTF-8%26oe%3Dutf-8%26safe%3Doff%26sa%3DN%26scoring%3Dd
)

I'm assuming the normal behavior is that when you start the PC with the
new drive along with the two good drives, Linux will start and
presumably give some "rebuilding the array" message. But my system
didn't do that; it gave no messages whatsoever (although the BIOS showed
all the hard drives as present). It didn't boot; it just showed a blank
screen.

Anybody want to venture a guess about what went wrong?

Robert Nagle, idiotprogrammer
log trail below:

# cat /proc/mdstat
Personalities : [raid1] [raid5]
read_ahead 1024 sectors
md1 : active raid1 ide/host2/bus1/target0/lun0/part1[2]
ide/host2/bus0/target0/lun0/part1[1]
ide/host0/bus0/target0/lun0/part1[0]
96256 blocks [2/2] [UU]

md3 : active raid5 ide/host2/bus1/target0/lun0/part3[2]
ide/host2/bus0/target0/lun0/part3[1]
58588800 blocks level 5, 32k chunk, algorithm 2 [3/2] [_UU]

md4 : active raid5 ide/host2/bus1/target0/lun0/part4[2]
ide/host2/bus0/target0/lun0/part4[1]
ide/host0/bus0/target0/lun0/part4[0]
55512448 blocks level 5, 32k chunk, algorithm 2 [3/3] [UUU]

www root # lsraid -a /dev/md3
lsraid: Unable to allocate memory while querying md device
[dev 9, 3] /dev/md3: Cannot allocate memory

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/md3              58587008   7710480  50876528  14% /
tmpfs                 58587008   7710480  50876528  14% /var/lib/init.d
/dev/md4              55510748  53822932   1687816  97% /home
none                    451716         0    451716   0% /dev/shm

/dev/md3 on / type reiserfs (rw,notail)
proc on /proc type proc (rw)
none on /dev type devfs (rw)
tmpfs on /var/lib/init.d type tmpfs (rw,mode=0644,size=2048k)
/dev/md4 on /home type reiserfs (rw,notail)
none on /dev/shm type tmpfs (rw)

#raid5 for 3 60gig hd
#/boot
raiddev /dev/md1
raid-level 1
nr-raid-disks 2
nr-spare-disks 1
chunk-size 32
persistent-superblock 1
device /dev/hda1
raid-disk 0
device /dev/hde1
raid-disk 1
device /dev/hdg1
spare-disk 0

#note: swap is on hda2, hde2, hdg2
#/ partition
raiddev /dev/md3
raid-level 5
nr-raid-disks 3
nr-spare-disks 0
chunk-size 32
persistent-superblock 1
parity-algorithm left-symmetric
device /dev/hda3
raid-disk 0
device /dev/hde3
raid-disk 1
device /dev/hdg3
raid-disk 2

# /home partition raid5
raiddev /dev/md4
raid-level 5
nr-raid-disks 3
nr-spare-disks 0
persistent-superblock 1
parity-algorithm left-symmetric
chunk-size 32
device /dev/hda4
raid-disk 0
device /dev/hde4
raid-disk 1
device /dev/hdg4
raid-disk 2
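
For anyone reading the log trail: in the /proc/mdstat output above, md3 reports [3/2] [_UU], meaning only two of its three members are active, while md1 and md4 report all members up. The Python sketch below shows one way such a status block could be scanned for degraded arrays. It is only an illustration, not part of the original setup: the function name degraded_arrays and the SAMPLE string (copied from the output above) are hypothetical, and the parsing assumes the 2.4-style mdstat layout shown in this post.

```python
import re

# Status summary at the end of an array's block, e.g. "[3/2] [_UU]".
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
# First line of an array's block, e.g. "md3 : active raid5 ...".
ARRAY_RE = re.compile(r"^(md\d+)\s*:")


def degraded_arrays(mdstat_text):
    """Return {array: (expected, active, flags)} for arrays missing members.

    Hypothetical helper: assumes each array block starts with "mdN :" and
    ends with a "[expected/active] [flags]" summary, as in the post.
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line.strip())
        if header:
            current = header.group(1)
        status = STATUS_RE.search(line)
        if status and current:
            expected = int(status.group(1))
            active = int(status.group(2))
            flags = status.group(3)
            # "_" marks a missing or failed member slot.
            if active < expected or "_" in flags:
                degraded[current] = (expected, active, flags)
            current = None
    return degraded


# Sample text taken from the /proc/mdstat output in this post.
SAMPLE = """\
Personalities : [raid1] [raid5]
read_ahead 1024 sectors
md1 : active raid1 ide/host2/bus1/target0/lun0/part1[2]
ide/host2/bus0/target0/lun0/part1[1]
ide/host0/bus0/target0/lun0/part1[0]
96256 blocks [2/2] [UU]

md3 : active raid5 ide/host2/bus1/target0/lun0/part3[2]
ide/host2/bus0/target0/lun0/part3[1]
58588800 blocks level 5, 32k chunk, algorithm 2 [3/2] [_UU]

md4 : active raid5 ide/host2/bus1/target0/lun0/part4[2]
ide/host2/bus0/target0/lun0/part4[1]
ide/host0/bus0/target0/lun0/part4[0]
55512448 blocks level 5, 32k chunk, algorithm 2 [3/3] [UUU]
"""

print(degraded_arrays(SAMPLE))  # {'md3': (3, 2, '_UU')}
```

On a live system the same function could be fed the contents of /proc/mdstat instead of SAMPLE. It only reports which arrays are degraded; it does not explain why the machine failed to boot after the swap.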


