Re: Aptitude/Grub Problem -- Is this a bug?



On Friday 24 February 2006 13:24, Justin Guerin wrote:
On Thursday 23 February 2006 18:23, Hal Vaughan wrote:
I posted earlier this week about some problems I had after doing:

aptitude update && aptitude upgrade

on a Sarge system. It required rebooting and was immediately
unbootable -- ON SARGE!!! This is the very stuff I am using stable
to avoid!

I lost a day tracking it down and finally found that when a kernel
image is updated, update-grub is run. Normally when apt/dpkg or
whatever part of apt actually upgrades a program and needs to
update a config file, it gives you a choice of updating or sticking
with the old file, or, at the very least, gives you a prompt and
warns you of the change. However, when a kernel image is updated,
it does not do ANY of these things. It doesn't warn you to back up
the /boot/grub/menu.lst file, it doesn't back it up itself, and it
does not, in any way, let you know it is doing this.

You aren't given a choice of keeping your old grub config file,
because without an update, you can't boot the new kernel. Well, OK,
you can, if you manually create the entry at the grub prompt, but you
know what I mean.

In this case, it's the same kernel image (again, I'm only upgrading
Sarge for security and bug fixes), so menu.lst did not need to be
changed to load a patched version of the same kernel version.

You aren't warned about update-grub removing an entry for a kernel,
because this is only done when you remove a kernel. If you've
removed a kernel, but don't remove its grub entry, then you've got an
entry that you can't use to boot. You don't want that.

It didn't just remove an entry. Update-grub completely overwrites the
file so any entries for kernels on other partitions are gone.

Picture this: you have 5 partitions, each with a different OS or
different Linux distros and different kernel versions on them. One
partition is your production partition, the one that HAS to always
work, so you use Sarge for it because upgrades/updates in Sarge are not
supposed to mess anything up. Do an "aptitude update && aptitude
upgrade" on your Sarge partition and, at least on my recent one,
aptitude finds an updated version of the kernel image you're using, so
it downloads and installs it. Since it's Sarge, you're not adding
anything in an upgrade; it is only replacing the same kernel image.
That means the same entry in menu.lst will work for the replacement
kernel (the same is true if only modules are upgraded).

Menu.lst is replaced anyway, which wipes out the entries for kernels and
OSes on the other 4 partitions and any custom options for that
particular kernel as well as custom options for any other kernels on
that partition.

Since this happened, I found that it is possible, in menu.lst, to
specify the default kernel options that are used and a few other
features so update-grub will use the config options I need when it
updates menu.lst, so (I think) I am protected on that for now.
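
For anyone looking for the same settings: as far as I know, Debian's GRUB (legacy) menu.lst keeps the defaults that update-grub reads as single-'#' comment lines, for example a line starting with "# kopt=". The sketch below only lists those lines so they can be reviewed before an upgrade. The option names in the pattern (kopt, groot, defoptions, altoptions, howmany) are given from memory and should be checked against the comments in your own menu.lst and the update-grub man page. The function name is made up for illustration.

```shell
#!/bin/sh
# Print the update-grub default-option lines from a menu.lst-style file.
# Assumption: the defaults are single-'#' comment lines such as
#   # kopt=root=/dev/hda1 ro
# Lines starting with '##' are ordinary explanatory comments and are skipped.
show_update_grub_defaults() {
    file=${1:-/boot/grub/menu.lst}
    grep -E '^# (kopt[_0-9A-Za-z]*|groot|defoptions|altoptions|howmany)=' "$file"
}
```

Run as `show_update_grub_defaults /boot/grub/menu.lst` and compare the output with what you expect update-grub to write into each generated kernel entry.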

The issue is that one has to FIND the additional options to fix the
situation and prevent a change that keeps your system from booting.
There is nothing, anywhere, to alert a sys admin that this will happen
and must be taken into account.

I know some users know every detail of their systems, but I can't
do that. I have a business to run and I started using Debian
Stable because it is supposed to not mess with things when it
upgrades. I could not find anything warning me of this. It turns
out there is documentation in update-grub's man page that I have
since used to make sure the options I've put in the list of boot
kernels are kept, but through testing, I've seen update-grub will
wipe out all entries for other kernels not on the current root
partition (and this happens whenever apt upgrades the kernel
image).

I'm not sure of your exact situation, but my experience with
update-grub is that it only creates or keeps entries for kernels it
thinks are installed.

That's what I've found -- and only kernels on the current partition. It
has little intelligence or ability to find kernels on other partitions
or to even scan the current list of entries and copy them (even copy
them commented out) into the new version.

I don't know whether or not update-grub depends
on apt's database, or if it just searches for kernels, but the source
would surely tell you.

It seems to just search for kernels on the current boot or system
partition.

Considering that the intent of stable is to be so reliable that one
can upgrade and count on the system continuing to work well, I
cannot see this lack of warning (and the lack of a backup) as
anything other than a serious bug. It could be easily fixed by
prompting the user with a warning that menu.lst is about to be
overwritten, so there's time to back it up. Even better, the
standard prompt for whether or not to overwrite a config file would
be nice, since it would let the user decide to update menu.lst or
not (or maybe back it up).
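
In the meantime, the backup can be taken by hand before each upgrade. The following is only a minimal sketch: the function name and the ".bak-" timestamp suffix are arbitrary choices for illustration, not something apt or update-grub provides.

```shell
#!/bin/sh
# Copy a file to FILE.bak-YYYYMMDD-HHMMSS before anything can overwrite it.
# Prints the path of the backup on success, returns non-zero on failure.
backup_file() {
    src=$1
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    dest="$src.bak-$(date +%Y%m%d-%H%M%S)"
    # -p keeps the original mode and timestamps on the copy
    cp -p "$src" "$dest" && echo "$dest"
}

# Typical use, as root (not executed here):
#   backup_file /boot/grub/menu.lst && aptitude update && aptitude upgrade
```

After the upgrade, running diff on the backup and the new menu.lst shows exactly which entries were dropped or rewritten.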

Is this not a bug? Was I just supposed to somehow know that out of
all the packages out there, this was a specific behavior in
upgrading the kernel? It makes me wonder how many other exceptions
are out there that I don't know about that could crash my system
next time I upgrade.

Do others feel a prompt would be appropriate in this case? I'd
like to hear feedback before I submit it as a bug, since there may
be some good reasons for doing this. However, I cannot imagine a
single good reason for overwriting a file this important without at
least telling the user/admin that it is happening.

Hal

What kernel package updated? If your kernel is installed because of
a package like linux-image-2.6-686, then I might understand what
happened here.

kernel-image-2.6.8-2-686, which includes the full version number, which
is, I *think* not the same as 2.6.

That is a dependency package. When you install that
package with aptitude, it pulls in the relevant kernel as a
dependency, and marks it as being automatically installed to satisfy
dependencies. When that package updates, and points to a new kernel
package, then aptitude removes the old kernel, since it was only
installed to satisfy a dependency, and installs the new package. In
this case, your working kernel will be removed (along with its grub
entry), and the new kernel will be put in its place. If something
fails in this operation, you would get an unbootable system (if that
was your only kernel).

I see that, but in this case, the specific version of the 2.6 kernel is
specified. From what I read, that is an actual package.

The solution is to mark the kernel you're using as manually installed,
so that it is not removed when it is no longer needed by any other
package.
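
With aptitude, clearing the "automatically installed" flag is done with the unmarkauto command. A small sketch follows; the wrapper function and its DRY_RUN switch are hypothetical conveniences, and the package name in the usage line is simply the one mentioned in this thread.

```shell
#!/bin/sh
# Mark a package as manually installed so aptitude will not remove it
# as an unused dependency. With DRY_RUN=1 the command is only printed.
keep_kernel() {
    pkg=$1
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "aptitude unmarkauto $pkg"
    else
        aptitude unmarkauto "$pkg"
    fi
}

# Example, as root (not executed here):
#   keep_kernel kernel-image-2.6.8-2-686
```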

Actually, at this point, I've fixed menu.lst so the options I need will
be automatically included. But I still cannot see how overwriting such
an important file without backing it up or prompting is not a bug.

...
So what kernel were you using, via what package, and what kernel did
you upgrade to, via what package, and did aptitude warn you it was
removing the older kernel? You don't mention this, but I'd be
surprised if it did and you missed it.

It wasn't a kernel upgrade. I'm not sure why aptitude actually calls it
an upgrade. It was "aptitude update && aptitude upgrade", which pulls
down the latest packages, which, on Sarge, means only bug and
security fixes. So, while it is called an upgrade, it should not have
actually upgraded to a new kernel version. No kernel was removed.

Even if I were upgrading to a new kernel, it seems to me if I am not
removing an earlier version, it should still keep that entry intact.

Thanks for the comments and for the ideas. I still think this is a bug,
under the category of oversight. I can see the need to re-write the
file, but that is a problem when it can just dump a lot of
configuration settings for kernels on other partitions or special
settings for other kernels.

Hal

