RE: GRUB failure

From: Kenneth Goodwin (kgoodwin_at_datamarktech.com)
Date: 08/02/03

  • Next message: Kenneth Goodwin: "RE: GRUB failure"
    To: <redhat-list@redhat.com>
    Date: Fri, 1 Aug 2003 19:38:59 -0400
    
    

    > Maybe I misunderstand here, but this seems to be a situation
    > where -
    >
    > A working Linux system using GRUB with MBR on HD-A, .conf file
    > on HD-A
    > had a new disk drive HD-B added to it.
    >
    > They did nothing apparently to reconfigure the box bootwise.
    > They added the filesystems onto HD-B and used them for whatever.
    >
    > HD-B croaked.
    >
    > When they removed HD-B from the ide BUS, things with GRUB broke
    > AND THEY DONT UNDERSTAND WHY?
    >
    > Since everything for GRUB should be on HD-A which is still
    > operational.....
    > Grub should not have a problem locating the .conf file since it
    > should still exist on HD-A.
    >
    > Why would GRUB whose ENTIRE RESOURCES SHOULD BE STILL INTACT
    > On DRIVE A
    > fail to work when DRIVE B is Removed from the IDE Chain?
    > > >
    > > > > The problem is, I have understood the problem. You can't
    > > > > boot when GRUB can't find grub.conf.
    > > >
    > > > GRUB doesn't even come that far as was explained in the first
    > > > message.
    > > >
    > > > If it booted into GRUB shell and then wouldn't find the
    > > > config file, that would be a minor problem.
    > > >
    > > > > You can when it can. When you remove hdb GRUB
    > > > > can't find the .conf file and that is why you can't boot.
    >

    >>OTTO's REPLY ----
    > I agree with what you are saying. Something is changing when HDB
    > is removed. I don't know what. I can tell you that the same thing
    > will happen if you add/delete a partition on 'hda'. We conjectured
    > that it could be hardware, but they say it's not so I don't know,
    > but we do know for a fact that it can't locate the .conf file by
    > what projection has taken place. It could be as simple as the
    > partitions being renumbered or as complicated as a GRUB problem.
    > But the addition and removal of the drive is changing something.

    I have not seen the GRUB source, but GRUB is apparently a multi-phase
    boot loader: the MBR piece loads the rest of the boot loader off of
    the /boot partition, which then loads the OS, etc.

    Okay. Most boot loaders do not record things as partition numbers in
    the MBR piece. I.e., what is in the MBR won't be an OS/Linux view of
    the hardware such as /dev/hda00 or whatever.

    Instead, it will be hardcoded from a hardware perspective: a disk
    drive number (IDE controller and unit select info, the stuff you see
    in the messages file during boot) and a block offset on the disk from
    which to start looking for the superblock. This will be the first
    block number on the IDE disk at which /boot physically starts.
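
    To make that concrete, here is a rough Python sketch of the kind of
    record I have in mind. The class name, field names and numbers are
    all made up for illustration; this is not GRUB's actual on-disk
    format, which I have not looked at.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BootPointer:
    """Hypothetical record held by the MBR piece (not GRUB's real layout)."""
    drive_number: int  # the drive as the hardware/BIOS sees it, not /dev/hdaN
    start_block: int   # first block on that disk where /boot physically starts


def describe(ptr: BootPointer) -> str:
    # The loader only knows "which drive" and "which block", nothing more.
    return f"drive {ptr.drive_number}, block {ptr.start_block}"


# Illustrative values only.
pointer = BootPointer(drive_number=0, start_block=63)
print(describe(pointer))
```

    The point is only that nothing in such a record is an OS-level
    partition name; it is a drive reference plus a raw offset.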

    However, the MBR should have the correct pointers to the HD-A drive
    and /boot partition, since the system was initially set up that way.
    If you stick drive B back on the bus and change nothing else, things
    work again, from what I understand here. I think your GRUB prompt is
    coming out of the MBR piece, but you would have to check the source
    code.

    Since the /boot filesystem is where the .conf file and the rest of
    GRUB are apparently located, and it is supposed to be the first
    partition on the drive, this should always work.

    Now, if you put /boot further in on the drive, then change the sizes
    of the partition(s) before the /boot partition and don't tell GRUB's
    MBR piece about the change, you will have the kind of problem that
    Otto apparently had with GRUB. That's because GRUB can't find the real
    start of the /boot partition, since the block offset into the disk
    just changed. Otto would have to confirm that was the case in his
    situation.
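
    A small sketch of that resize scenario, assuming partitions are laid
    out back to back from block 0 (a simplification) and using invented
    sizes:

```python
def start_block(partition_sizes, index):
    """Start block of partition `index`, given sizes (in blocks) in disk order."""
    return sum(partition_sizes[:index])


# Hypothetical layout: /boot is the second partition (index 1).
sizes_at_install = [1000, 200, 5000]
recorded = start_block(sizes_at_install, 1)   # what the MBR piece remembers

# Later, the partition in front of /boot is grown without telling the loader.
sizes_after_resize = [1500, 200, 5000]
actual = start_block(sizes_after_resize, 1)

print(recorded, actual, recorded == actual)
```

    With /boot as the first partition (index 0) the start block is 0 in
    both layouts, which is why that arrangement does not go stale.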

    I think that Otto was on the right track, but was not explaining his
    position clearly. The issue here is apparently why adding drive B
    does not mess up the MBR's sense of where drive A is on the IDE
    chain, but when drive B is removed the MBR can't find drive A.

    There must be a radical shift in how the BIOS deals with the drives
    between the original creation of the Linux install in a single-drive
    configuration, the addition of the second drive, and its subsequent
    removal.

    (Of course, this is only a theory mind you...)

    Sort of an example to give you the idea of where I am going here -

            At create time of Linux

                    Drive A - bus address 123456 - recorded that way in MBR

            Add in drive B

                    ROM BIOS reconfiguration of its device tables.....

                    Drive A - bus address 123456, MBR points to 123456
                    Drive B - bus address 123455

            Remove drive B

                    ROM BIOS reconfiguration of its device tables.....

                    Drive A - bus address 123454, MBR points to 123456

                    ROM BIOS loads MBR, MBR tries to load rest of GRUB
                    from disk at bus address 123456 and fails
                    (no such device or address, since BIOS changed the
                    base address of drive A for whatever reason during
                    reconfiguration).

            Reinstall drive B

                    ROM BIOS reconfiguration of its device tables.....

                    Drive A - bus address 123456, MBR points to 123456
                    Drive B - bus address 123455

                    ROM BIOS loads MBR, MBR tries to load rest of GRUB
                    from disk at bus address 123456 and succeeds.

                    BIOS restores drive A address to original create-time
                    settings.
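
    The same walk-through as a toy Python simulation. The "bus
    addresses" are the made-up numbers from the example above, and the
    lookup function is a hypothetical stand-in for whatever the MBR piece
    really does; it only illustrates the theory.

```python
def load_second_stage(bios_table, mbr_address):
    """Return the drive found at mbr_address, or None if nothing answers there."""
    for drive, address in bios_table.items():
        if address == mbr_address:
            return drive
    return None


MBR_POINTER = 123456  # recorded at install time, never updated afterwards

# Made-up BIOS device tables for each configuration in the example.
configs = {
    "install": {"A": 123456},
    "B added": {"A": 123456, "B": 123455},
    "B removed": {"A": 123454},
    "B reinstalled": {"A": 123456, "B": 123455},
}

for name, table in configs.items():
    found = load_second_stage(table, MBR_POINTER)
    status = f"loads rest of GRUB from drive {found}" if found else "fails"
    print(f"{name}: {status}")
```

    Only the "B removed" case fails, because the stale pointer no longer
    matches anything in the BIOS table.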

    In other words, this is probably a hardware/BIOS issue and not
    anything wrong with GRUB itself. By removing the B drive you are
    somehow changing the low-level device references for the A drive, and
    the changed address is not consistent with the address used for the
    original install, but somehow it resets to those values when the B
    drive is reinserted on the bus.

    I guess you could put debug code into GRUB to dump out the values it
    gets from the BIOS initially, to see how they change. More than
    likely this is a BIOS issue; the question is how to verify or
    disprove it. If we know this can happen, perhaps GRUB's MBR piece can
    be modified by its creators to deal with it.
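
    If such dumps could be captured with and without drive B, comparing
    them would be the easy part. A minimal sketch, assuming each dump is
    reduced to a name/value mapping; the keys and values here are
    placeholders taken from my example, not real BIOS output:

```python
def diff_dumps(before, after):
    """Return {key: (old, new)} for every value that differs between two dumps."""
    keys = sorted(set(before) | set(after))
    return {k: (before.get(k), after.get(k))
            for k in keys if before.get(k) != after.get(k)}


# Placeholder values; a real dump would come from the instrumented loader.
with_drive_b = {"drive_a_address": 123456, "drive_b_address": 123455}
without_drive_b = {"drive_a_address": 123454}

print(diff_dumps(with_drive_b, without_drive_b))
```

    Any entry for drive A showing up in the diff would support the BIOS
    theory; an empty diff for drive A would go against it.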

    The bus address references could simply be the index into the BIOS
    drive tables plus their address. You have to somehow account for
    SCSI controllers and IDE controllers, and for booting off drives on
    the SCSI controller, so it has to be more than a table index in the
    MBR. The "address" has to point to the controller address and carry
    enough info to fetch data off of the /boot sector for phase two of
    the load.

    I have not done any OS development on PCs, so take this with a grain
    of salt....

    -- 
    redhat-list mailing list
    unsubscribe mailto:redhat-list-request@redhat.com?subject=unsubscribe
    https://www.redhat.com/mailman/listinfo/redhat-list
    
