Re: [ squeeze ] Grub2 RAID1 LVM2 boot failure



d.sastre.medina@xxxxxxxxx wrote:

On Sat, May 29, 2010 at 05:44:22PM -0400, Tom H wrote:
On Sat, May 29, 2010 at 7:06 AM, David Sastre Medina
<d.sastre.medina@xxxxxxxxx> wrote:

Grub2 is failing to boot a softRAID1 + LVM2 squeeze box.

I use an equivalent setup and it was all set up correctly and
automatically by the `update-grub2` command (once the system had
booted correctly).

Keep reading.

I have an 'md1' as '/boot' and an lvm2 '/' on 'md2', this is what my system uses:

grub.cfg:
menuentry "Debian GNU/Linux, with Linux 2.6.32-trunk-amd64" --class debian --class gnu-linux --class gnu --class os {
insmod raid
insmod mdraid
insmod ext2
set root='(md1)'
search --no-floppy --fs-uuid --set 1be9c4e5-70cd-4662-81e6-44e76cff20d8
echo Loading Linux 2.6.32-trunk-amd64 ...
linux /vmlinuz-2.6.32-trunk-amd64 root=UUID=25defa7a-93cb-40eb-9a76-c326f0b2dffc ro vga=792
echo Loading initial ramdisk ...
initrd /initrd.img-2.6.32-trunk-amd64
}

blkid: `blkid /dev/md[1,2]` (run `blkid -g` first to purge stale entries from its cache).
/dev/md1: UUID="1be9c4e5-70cd-4662-81e6-44e76cff20d8" TYPE="ext2"
/dev/md2: UUID="25defa7a-93cb-40eb-9a76-c326f0b2dffc" TYPE="ext2"

grub-probe: `grub-probe -t fs_uuid /boot`, `grub-probe -t fs_uuid /`
1be9c4e5-70cd-4662-81e6-44e76cff20d8
25defa7a-93cb-40eb-9a76-c326f0b2dffc

mdadm: `sudo mdadm -D /dev/md[1,2] | grep UUID`
UUID : ff7e23a3:dc6327b6:73d158fc:63c6b3dd
UUID : 157b664b:7b41974f:73d158fc:63c6b3dd

It's booting fine all the time.

root@sysresccd /root % mdadm --detail /dev/md0
/dev/md0:

UUID : 8052f7d4:54a97fbb:731031f6:bc3d041c

That UUID is not the one that grub will use to boot.
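
The two kinds of UUID quoted so far can be told apart by their shape alone:
blkid and grub-probe print the filesystem UUID as 8-4-4-4-12 hex digits (for
ext2, as here), while `mdadm -D` prints the array UUID as four colon-separated
groups of eight. A minimal sketch of that check, assuming only those two
shapes matter; `uuid_kind` is a hypothetical helper name, not part of grub,
mdadm or util-linux:

```shell
#!/bin/sh
# uuid_kind (hypothetical helper): classify a UUID string by its shape.
# This is a heuristic based on the formats quoted in this thread.
uuid_kind() {
    if printf '%s\n' "$1" | grep -Eqi '^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'; then
        echo filesystem      # shape printed by blkid / grub-probe -t fs_uuid
    elif printf '%s\n' "$1" | grep -Eqi '^[0-9a-f]{8}(:[0-9a-f]{8}){3}$'; then
        echo md-array        # shape printed by mdadm -D
    else
        echo unknown
    fi
}

uuid_kind 1be9c4e5-70cd-4662-81e6-44e76cff20d8   # prints: filesystem
uuid_kind ff7e23a3:dc6327b6:73d158fc:63c6b3dd    # prints: md-array
```

Only the first shape is what a `search --fs-uuid` line expects.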

I see two possible problems when looking at your grub.cfg.

1. There isn't an "insmod lvm" within the menuentry stanza. ext2,
raid, and mdraid are insmod'd twice in the header and once in the
menuentry and lvm is insmod'd just once in the header. (This is one of
the grub2 mysteries; why multiple insmods of the same modules?). I
doubt that this is the source of the problem (the first insmod must be
enough!) but you could add "insmod lvm" within the menuentry.

Already tried that. No success.

That is not your problem IMO.

2. In the uuid of the search line, what is
785366b0-d597-4e9c-9284-b6b9161236ed? One of your /dev/sX1's uuid?
Since raid and mdraid are loaded, can't you/shouldn't you use the md0
uuid above?

I also tried that. It fails.
That UUID belongs to /root_vg-root_lv, where the root filesystem
resides.
The UUID can be confirmed at the grub prompt by issuing
grub> ls (root_vg-root_lv)

No, from grub's point of view the `root` partition is the one it boots
from, i.e. /boot. The kernel then needs its own `root` filesystem, and
that is where the UUID of /root_vg-root_lv belongs: in the `linux`
line.

Note that `boot' is a multidisk partition (sda1 and sdb1, which assemble
md0), thus root='(md0)' makes sense from a grub point of view.

Correct.

And md1 is the result of assembling sda2 and sdb2. This md device has only
one VG on top of it, root_vg, with several LVs in it, one of these LVs
being my root_lv.

That looks OK.

This is my default menuentry now:
menuentry "Debian GNU/Linux, with Linux 2.6.32-3-686-bigmem" --class debian --class gnu-linux --class gnu --class os {
insmod raid
insmod mdraid
insmod lvm
insmod ext2
set root='(md0)'
search --no-floppy --fs-uuid --set 785366b0-d597-4e9c-9284-b6b9161236ed
echo Loading Linux 2.6.32-3-686-bigmem ...
linux /vmlinuz-2.6.32-3-686-bigmem root=/dev/mapper/root_vg-root_lv ro rootdelay=15 quiet
echo Loading initial ramdisk ...
initrd /initrd.img-2.6.32-3-686-bigmem
}

The `set root' entry says what is *root* for grub. I understand this as:
where /boot/grub/grub.cfg, /vmlinuz-`uname -r` and /initrd.img-`uname -r`
are. So IMHO it should be called boot='(md0)' for better understanding and
disambiguation from the *other* root in the `linux' line.

Yes, that's exactly it.

The GRUB root device is not the same as the Linux kernel root= parameter.
BTW this command is undocumented in the wiki, which still uses grub-legacy's
info; that doesn't apply anymore, given the `root' command has been
replaced.

But grub has nothing to do with this parameter; it is a kernel `boot parameter`
(well, more of an initrd boot parameter, but that is a different area):
http://www.mjmwired.net/kernel/Documentation/kernel-parameters.txt
line 2193.
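
To make the two meanings of "root" concrete, here is a rough sketch that
pulls both values out of a menuentry body. The sample text is copied from
the menuentry quoted above; `grub_root` and `kernel_root` are hypothetical
helper names, and the parsing assumes the simple one-command-per-line layout
that update-grub generated in this thread.

```shell
#!/bin/sh
# Body of the menuentry quoted above (insmod and echo lines omitted).
entry=$(cat <<'EOF'
set root='(md0)'
search --no-floppy --fs-uuid --set 785366b0-d597-4e9c-9284-b6b9161236ed
linux /vmlinuz-2.6.32-3-686-bigmem root=/dev/mapper/root_vg-root_lv ro rootdelay=15 quiet
initrd /initrd.img-2.6.32-3-686-bigmem
EOF
)

# grub_root (hypothetical): value of "set root=", i.e. where grub looks
# for the kernel and initrd files.
grub_root() {
    printf '%s\n' "$1" | sed -n "s/^[[:space:]]*set root=//p" | tr -d "'"
}

# kernel_root (hypothetical): value of root= on the "linux" line, i.e. the
# filesystem the kernel/initrd will mount as /.
kernel_root() {
    printf '%s\n' "$1" |
        sed -n 's/^[[:space:]]*linux[[:space:]].*[[:space:]]root=\([^[:space:]]*\).*/\1/p'
}

echo "grub root:    $(grub_root "$entry")"
echo "kernel root=: $(kernel_root "$entry")"
```

The first value is only ever read by grub; the second is passed through
untouched on the kernel command line.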

The `search' line, as stated in the grub wiki:

Search devices by file, filesystem label or filesystem UUID. If --set
is specified, the first device found is set to a variable. If HD
variable name is specified, "root" is used.

I believe there is a mistake, and that the `HD` should be `NO`, meaning
that if no variable name is supplied, the value is assigned to the `root` variable.

This effectively repeats what the previous command did, IMO.

I take this to mean that the first device found _whose UUID is_ 785...
(the UUID of my root_vg-root_lv) will be the `root' filesystem.

Well, the root for grub, not the root for the kernel.
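
One way to see which device a `search --fs-uuid` line will pick is to look
the UUID up in blkid output. A minimal sketch, assuming blkid-style lines on
stdin; `device_for_uuid` is a hypothetical helper name, and the sample lines
are the md1/md2 values quoted earlier in this message (on a live system the
input would come from blkid itself).

```shell
#!/bin/sh
# device_for_uuid (hypothetical helper): read blkid-style lines on stdin and
# print the device whose UUID="..." field equals the first argument.
device_for_uuid() {
    awk -v want="$1" '
        {
            dev = $1
            sub(/:$/, "", dev)                  # strip the trailing colon
            for (i = 2; i <= NF; i++)
                if ($i == ("UUID=\"" want "\"")) print dev
        }'
}

blkid_output='/dev/md1: UUID="1be9c4e5-70cd-4662-81e6-44e76cff20d8" TYPE="ext2"
/dev/md2: UUID="25defa7a-93cb-40eb-9a76-c326f0b2dffc" TYPE="ext2"'

# UUID taken from the search line of the first menuentry in this message.
printf '%s\n' "$blkid_output" | device_for_uuid 1be9c4e5-70cd-4662-81e6-44e76cff20d8
# prints: /dev/md1
```

In that menuentry the search UUID resolves to the /boot array, which agrees
with its set root='(md1)' line.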

And yet another definition of `root' after the `linux' call.
That one states that:

root=/dev/mapper/root_vg-root_lv which could be written also as:
root=LABEL=root or even
root=UUID=785366b0-d597-4e9c-9284-b6b9161236ed

Yes, all are correct, and I strongly recommend using the UUID value from the blkid command.

Warning: The command blkid needs a `blkid -g` first to clear the stored UUIDs in its cache.

The three of them should be right. None of them work.

Your problem seems to be that the KERNEL can't find the root
filesystem; there is nothing grub can do to solve that.

If I remove the `quiet' option from the `linux' line, what I can see
is LVM initializing *before* mdadm has done its job:

"Volume group "root_vg-root_lv not found
Skipping volume group root_vg
Unable to find LVM volume root_vg-swap_lv
mdadm:/dev/md0 has been started with two drives
mdadm:/dev/md1 has been started with two drives
Gave up waiting for root device."

That confirms it: the problem is the kernel not finding the correct `root` filesystem.
Use the blkid UUID on that line.
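
The ordering in those messages can also be checked mechanically if the boot
output is saved to a file. A rough sketch, assuming the message wording
quoted above; `lvm_before_md` is a hypothetical helper name and the sample
text is the log from this message (with its typo corrected), not fresh output.

```shell
#!/bin/sh
# lvm_before_md (hypothetical helper): exit 0 if, in the boot messages on
# stdin, the first "Volume group ... not found" line comes before the first
# "mdadm: ... has been started" line; exit 1 otherwise.
lvm_before_md() {
    awk '
        /Volume group .* not found/ && !lvm { lvm = NR }
        /mdadm:.* has been started/ && !md  { md = NR }
        END { exit !(lvm && md && lvm < md) }'
}

log=$(cat <<'EOF'
Volume group "root_vg-root_lv not found
Skipping volume group root_vg
Unable to find LVM volume root_vg-swap_lv
mdadm:/dev/md0 has been started with two drives
mdadm:/dev/md1 has been started with two drives
Gave up waiting for root device.
EOF
)

if printf '%s\n' "$log" | lvm_before_md; then
    echo "LVM scanned before the md arrays were assembled"
fi
```

This only reports the order of the messages; it says nothing about why the
initrd runs the two steps in that order.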

So it looks like a timing issue *but* I have tried issuing the commands
manually, in the right order, at the grub prompt:
1) insmod-ing raid, mdraid, lvm and ext2; setting root to md0;
2) searching for devices (also a variant without this step);
3) calling linux with the right root device
(all three variants of this step: dev name, UUID and LABEL, and with
different rootdelay timings, always without `quiet') and, finally;
4) calling initrd.

Failure again. root_vg is never found.

Once you have booted into this system, `update-grub` should write
this file correctly; grub.cfg is regenerated on any kernel change.

Make sure `update-grub` has created a good grub.cfg before you reboot.

One further question: after a reboot, while at the grub screen, before
doing anything else, if I enter the command line and type `ls' at the
prompt, I can see all of my LVs, and listing any one of them returns:
device name, filesystem type, label, last modification time and UUID.
Where does this info come from? Supposedly, there aren't mods loaded to
read that yet, until after `insmod' loads them, are there?

That's the 'core.img' code for grub, which needs to read
all UUIDs correctly to do its job.


--
Antonio Perez




