Re: Problem booting RAID1/mdadm system when one disk is unplugged





On Fri, 1 Sep 2006, James Brown wrote:

Justin Piszcz wrote:


On Fri, 1 Sep 2006, James Brown wrote:

All,

I have already posted this question to gmane.linux.raid, but would really appreciate some help from a Debian perspective please...

My system has 2x120GB IDE disks with an up-to-date Sarge install, running kernel 2.6.8-3 and configured for mirroring.

When I tested booting my system, I found:

a) A kernel panic unless both disks are plugged in.
b) One disk is removed from the array on each reboot.

The relevant logs are on this thread:
http://article.gmane.org/gmane.linux.raid/13033/match=newbie+kernel+panic+raid1

I was advised that this is a problem with my initrd because it doesn't contain an mdadm.conf file, and presumably that I should make a new one.

Unfortunately, some friends of mine do not agree that my initrd is the problem because they point out that I can still boot when two disks are present. What do people here think?

Thanks in advance

James.


--
To UNSUBSCRIBE, email to debian-user-REQUEST@xxxxxxxxxxxxxxxx with a subject of "unsubscribe". Trouble? Contact listmaster@xxxxxxxxxxxxxxxx



1) You do not need an initrd at all.

I'm actually a newbie and using GRUB as it was installed by default.

Will deleting the /boot/initrd.img-2.6.8-3-386 file and the initrd line in menu.lst be sufficient?


No, you need to compile your own kernel with support for your hardware built in.

2) I would make sure you are booting from the /dev/md* partition.

I *think* I am booting from the /dev/md0 partition:

# cat /boot/grub/menu.lst
[...]
# groot=(hd0,5)
[...]
title Debian GNU/Linux, kernel 2.6.8-3-386
root (hd0,5)
kernel /boot/vmlinuz-2.6.8-3-386 root=/dev/md0 ro
initrd /boot/initrd.img-2.6.8-3-386
savedefault
boot
[...]

That appears to be right, but I would ditch the initrd stuff and use a statically compiled kernel that you create for your hardware. Also, I use LILO here.
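
For reference, the root (hd0,5) line in the menu.lst above uses GRUB legacy numbering, which counts partitions from 0, while Linux device names count from 1. A minimal sketch of that arithmetic follows; the function name grub_part_to_linux is hypothetical and only illustrates the off-by-one. It does not resolve which physical disk the BIOS presents as hd0.

```shell
#!/bin/sh
# Hypothetical helper (not part of GRUB): convert the partition index in a
# GRUB legacy spec such as "(hd0,5)" to the Linux partition number.
# GRUB legacy counts partitions from 0; Linux counts them from 1.
grub_part_to_linux() {
    spec=$1
    part=${spec#*,}     # drop everything up to and including the comma
    part=${part%\)}     # drop the trailing parenthesis
    echo $((part + 1))
}

# The spec from the menu.lst in this thread.
grub_part_to_linux "(hd0,5)"
```

So (hd0,5) is partition 6 on whichever disk the BIOS lists first, which is hda6 while both disks are plugged in.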


I understand this means the system will load the bootloader from the first disk the BIOS presents, and partition number 6 on that disk (in my case, hda6 or hdc6). It will then try to boot the kernel from /dev/md0, which it should manage:

# cat /etc/fstab
proc /proc proc defaults 0 0
/dev/md0 / ext3 defaults,errors=remount-ro 0 1
/dev/md1 /var/mail ext3 defaults 0 2
/dev/hda5 none swap sw 0 0
/dev/hdc5 none swap sw 0 0
/dev/hdd /media/cdrom0 iso9660 ro,user,noauto 0 0

I don't see a problem here, but please correct me if I've said something wrong.


A classic mistake. NEVER put swap on separate, unmirrored partitions on different drives. You always want to RAID1 the swap as well: if one disk dies while you are swapping to it, the integrity of your data is at risk.
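
As a rough way to spot this, the sketch below lists swap entries in fstab-format input whose device is not an md array. The function name non_raid_swap is hypothetical, and the check assumes md devices are named /dev/md*, as they are in the fstab quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print swap devices from fstab-format input on stdin
# that are not on an md (software RAID) device.
# fstab fields: device, mount point, type, options, dump, pass.
non_raid_swap() {
    awk '$1 !~ /^#/ && $3 == "swap" && $1 !~ /^\/dev\/md/ { print $1 }'
}

# Input copied from the fstab quoted in this thread.
non_raid_swap <<'EOF'
proc /proc proc defaults 0 0
/dev/md0 / ext3 defaults,errors=remount-ro 0 1
/dev/md1 /var/mail ext3 defaults 0 2
/dev/hda5 none swap sw 0 0
/dev/hdc5 none swap sw 0 0
/dev/hdd /media/cdrom0 iso9660 ro,user,noauto 0 0
EOF
```

On a live system the same function could be fed /etc/fstab directly (non_raid_swap < /etc/fstab). Here it reports /dev/hda5 and /dev/hdc5, the two partitions that would need to be combined into a mirrored swap device.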

3) You must have a /boot on a /dev/md partition.

From looking at the above, /boot would fall under /dev/md0 (I think)?


It /looks/ right.

4) You need LILO installed with the special raid option in the config.

Maybe I should ditch GRUB and learn/install LILO.

Definitely.
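
The "special raid option" in point 4 is not named in this thread. Assuming it refers to LILO's raid-extra-boot setting, a lilo.conf fragment might look roughly like the sketch below. The image path is copied from the menu.lst above; the label is a placeholder, and the whole fragment is an untested illustration rather than a known-good configuration.

```
# Sketch only: assumes "special raid option" means raid-extra-boot.
boot=/dev/md0                 # install the boot record on the RAID1 device
raid-extra-boot=mbr-only      # also write boot records to the MBRs of the array's disks
root=/dev/md0

image=/boot/vmlinuz-2.6.8-3-386
    label=Linux               # placeholder label
    read-only
    # If the stock kernel is kept, an initrd= line would also be needed here.
```

After editing lilo.conf, lilo has to be re-run as root for the change to take effect.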

If you are ever on IRC sometime, I may be able to better help you with your issues.

The next step you really need to take is to compile your own kernel, get it working with RAID support, and stop using the initrd-based kernel.

Justin.


5) I have done this and tested it by pulling each HDD out, and it worked
successfully with Debian + 2x74GB Raptors.

Justin.



Thanks very much for your time.




