RE: [SLE] Problems with initrd after mkinitrd



> -----Original Message-----
> From: Carl Hartung [mailto:suselinux@xxxxxxxxxxxxx]

<snip>

> As I'm sure you're aware ;-) roughly a minute after you posted this,
> Patrick wrote that he meant "sector 2082" and not "cylinder." (I'm not
> convinced, however, that /he's/ convinced, so I'm afraid we'll both
> have to stay tuned...)

Ok, so I looked -- that is cylinder 2082. Sorry about being a dolt, but I
have a tendency to *fuzz* up that which I don't absolutely need to know
after I've checked it out -- all I really knew was that there should be
no issues with BIOS access to the disk. I appreciate your gentle
nudges, however, instead of calling me an idiot (which might have been
warranted here).

I am very interested in what you said in another post about C,H,S
being the default in the BIOS; I am going to look through the Intel docs
on this to see if there is any reference. [After looking: this document,
ftp://download.intel.com/support/motherboards/server/se7520bd2/sb/se7520bd2_server_board_tps_r23.pdf
(page 62), indicates that LBA is the default in the BIOS for devices
that support it.]

<snip>

> I don't know that Patrick was actually /using/ fdisk, only that
> sectors are a natural metric for partitioning drives that use LBA.

Yes, I was using fdisk, but I had long since memorized the number for
scripts or hand editing, and hadn't looked at it in a while -- here is
an 'fdisk -l /dev/hda' for the curious:

Disk /dev/hda: 40.0 GB, 40007761920 bytes
16 heads, 63 sectors/track, 77520 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1               1        2081     1048792+  82  Linux swap / Solaris
/dev/hda2   *        2082       77520    38021256   83  Linux
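
As a rough cross-check of the listing above, the cylinder numbers can be turned into LBA sector numbers with a few lines of Python. This is only a sketch: it assumes fdisk counts cylinders from 1, and that hda1 begins at sector 63 (one track skipped for the MBR), which is not shown in the listing but would account for the "+" (an odd sector count) on hda1. The 1024-cylinder figure is the classic CHS-only BIOS limit, included for comparison; with LBA in use it should not apply.

```python
# Geometry as reported by 'fdisk -l /dev/hda' in the listing.
HEADS = 16
SECTORS_PER_TRACK = 63
SECTOR_BYTES = 512
SECTORS_PER_CYLINDER = HEADS * SECTORS_PER_TRACK  # 1008


def cylinder_start_lba(cylinder):
    """First LBA sector of a cylinder (ASSUMPTION: fdisk numbers from 1)."""
    return (cylinder - 1) * SECTORS_PER_CYLINDER


def partition_kib(first_cyl, last_cyl, skipped_sectors=0):
    """Size in 1 KiB blocks of a partition spanning whole cylinders."""
    sectors = (last_cyl - first_cyl + 1) * SECTORS_PER_CYLINDER
    return (sectors - skipped_sectors) * SECTOR_BYTES / 1024


hda2_start = cylinder_start_lba(2082)

# Classic CHS-only BIOS limit of 1024 cylinders, for comparison only.
chs_limit_sectors = 1024 * SECTORS_PER_CYLINDER

print("hda2 first sector:", hda2_start)
print("hda2 byte offset: ", hda2_start * SECTOR_BYTES)
print("hda2 size in KiB: ", partition_kib(2082, 77520))
# ASSUMPTION: hda1 skips the first track (63 sectors).
print("hda1 size in KiB: ", partition_kib(1, 2081, skipped_sectors=63))
print("past 1024-cylinder mark:", hda2_start >= chs_limit_sectors)
```

Under those assumptions the sizes come out to the Blocks column of the listing (38021256 for hda2, 1048792.5 for hda1), and hda2 starts roughly 1 GiB into the disk, well past cylinder 1024 in this 16-head geometry.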

<snip>

> - One, less frequent, is the failure of some of the cloned drives to
> boot immediately after they've been created and installed. He is
> presently overcoming this 'flavor' of boot failure by reinstalling Grub.

Correct.

> - The second 'flavor,' which is occurring more frequently, is a
> failure to boot after modifying the normally running system and
> creating a new initrd (with mkinitrd.) IOW, these cloned drives have
> /not/ failed to boot, initially, and the systems have been running
> normally for some time. Then, after installing updates, he's run
> mkinitrd and the systems suddenly fail to boot. He is presently
> overcoming this 'flavor' by tarring up the drive contents,
> repartitioning the drive and restoring the contents.

Correct.

> In both cases, the boot failures are occurring *only* on the drives
> that have been cloned. He is not experiencing either type of boot
> failure in those systems where the drives have been installed raw and
> the systems have been built and upgraded from scratch.

I'm sorry -- I haven't been clear enough. The second flavor occurs on
*all* drives after updates: whether the sample set is fresh-install or
imaged, it produces the same 30%ish hang rate after mkinitrd. I've sort
of been thinking out loud in these posts -- and in the process have been
a bit confusing -- I apologize.

Some posts have made me realize I hadn't thought about one thing or
another, and so I went back to look at it. I think my response was to
the thought that there might be some kind of problem with the BIOS
accessing the drive, so I assumed that fresh installs are going to write
near the front of the disk first and subsequent updates would
potentially get written elsewhere. Then I later remembered that the
partition started at cylinder 2082 (but then confused cylinder and
sector) and continued to be perplexed. So my state at that point was: I
thought that the description of the problem (BIOS not able to read part
of the disk where the initrd resides) fit very well with the behavior I
was seeing, even though I *knew* that that particular problem
*shouldn't* affect me, and thus came to the same hypothesis that Carlos
did: that some potentially *unknown* BIOS problem that causes part of
the disk to be inaccessible might be the issue.

The main behavior is intermittent, and it could be hardware related
(BTW, WD recommended that we not use this drive because it was designed
for a desktop, not server, duty cycle). Subsequent updates on drives
that hung previously sometimes hang and sometimes do not. In my
swamped-ness, I have looked now and again at *what* I was doing and
could find no *reason*. The most recent hang was due to an updated xfs
driver -- a minor modification of the source that we made in house and
compiled on a different machine. The module loads fine on a running
system, and in this case was updated in the initrd without issue on 2
out of 5 machines.

<snip>

> But he is building new systems using contemporary components that
> support Logical Block Addressing. As I understand it, these systems
> should have no difficulty booting from any location on a 40GB disk.

Exactly why I hadn't considered BIOS to be at issue.

> I agree that the BIOS limitation you're alluding to is still a common
> problem, but I think it only concerns older hardware than what Patrick
> is dealing with. I am increasingly confident that the error lies
> somewhere in the realm of drive address calculations and/or
> translations.

What you state here (about calculations/translations) could explain the
problem quite nicely (and actually, I think, is a great fit for what
Carlos is saying).

> That is /my/ educated guess. I am still thinking about ways to test
> the drives for proof, though. Any ideas?

I have the used blocks ( -D ) output from debugreiserfs from both a
working and a hung system. I think it has enough information to tell
what part of the filesystem the initrd is in -- I am not sure if it
indicates the LBA sectors. If it does, I will be able to glean that in
a *very* tedious process. But this still wouldn't answer your question,
I think, since it wouldn't tell us if the BIOS can properly address and
read those blocks. So, I am curious, too (but I am also very grateful
for the help I have already received and do not want to appear greedy).
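
If a block number can be pulled out of that output, turning it into an absolute disk sector is the easy part. The sketch below is hypothetical throughout: it assumes the block number is relative to the start of /dev/hda2, that the filesystem block size is 4096 bytes (this should be checked against the superblock), and that hda2 starts at sector 2097648, i.e. (2082 - 1) * 16 * 63 from the fdisk geometry. The block number 100000 is made up purely for illustration.

```python
SECTOR_BYTES = 512
HDA2_START_LBA = (2082 - 1) * 16 * 63  # 2097648, from the fdisk geometry
FS_BLOCK_BYTES = 4096  # ASSUMPTION: filesystem block size; verify on the disk


def fs_block_to_lba(block, part_start=HDA2_START_LBA,
                    block_bytes=FS_BLOCK_BYTES):
    """Return (first, last) absolute LBA sectors covered by one fs block."""
    sectors_per_block = block_bytes // SECTOR_BYTES
    first = part_start + block * sectors_per_block
    return first, first + sectors_per_block - 1


def lba_to_chs(lba, heads=16, sectors_per_track=63):
    """Convert an LBA sector to (cylinder, head, sector).

    Cylinder and head are 0-based, sector is 1-based, so the cylinder
    is one less than the number fdisk prints.
    """
    cylinder, rest = divmod(lba, heads * sectors_per_track)
    head, sector = divmod(rest, sectors_per_track)
    return cylinder, head, sector + 1


# Hypothetical block number, for illustration only.
first, last = fs_block_to_lba(100000)
print("LBA range:", first, "-", last)
print("CHS of first sector:", lba_to_chs(first))
```

Comparing the sector ranges for the initrd on a working and a hung system would at least show whether the hung ones sit in a different region of the disk; it would not by itself show whether the BIOS can read those sectors.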

<snip>

Thanks,

Patrick



