Old version of lilo fails to boot 2.6.23




Greetings:

I upgraded to version 2.6.23 and had a fun time figuring out the source of
this boot failure message on my x86 system:

This kernel requires an i<random integer>86 CPU, but only detected an
i<smaller random integer>86 CPU.

It turns out that my version of lilo (lilo -V gives version 21) doesn't set
up the stack and data segment registers in a compatible manner before
entering the new 16-bit real mode kernel loader code. This problem is new
to the 2.6.23 series.

Parts of the 16-bit real mode loader code are now being compiled as C code
with gcc in 32 bit mode passing the .code16gcc directive to the assembler to
correct the stack frames to 16 bit. This kludge won't work unless all the
16-bit segment registers are set to the same value. Gcc only manipulates
the offset of the address and doesn't know anything about segment registers
or segment override prefixes. My lilo was setting SS=0x8000, DS=0x9000, and
SP=0xB000 before entering the kernel loader. This makes stack automatics
unreachable from the data segment without segment override prefixes.

I was tempted to patch the kernel code, but instead decided to try
"upgrading" lilo to grub-0.97 and found that grub works just fine. This
also has the significant advantage that we won't need those nasty as86 and
ld86 things any more since lilo was the last package on our systems that
used them.

However, it would probably be a good idea to modify the kernel loader to
lock out interrupts and explicitly set up the stack in its assembly startup
code to insure that the stack is located correctly above the code in the
same segment, rather than relying on the boot loader to do the right thing. The existing setup code already insures that the other segment registers are
equal but omits the stack segment register. Also, because lilo (and
others?) loads the data/code segment at 0X90000, the stack pointer would
have to be set no higher than 0XA000 to avoid potential overwrites of the
EBDA. But I believe from my look at the code that the data/code sits below
0X8000 in the segment, so this should be fine.

If others think this is a good thing, I will test and submit a patch.

Please CC me directly as I am no longer subscribed to the list.


Best regards,

Joseph
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • Re: Questions about Minix
    ... Of course you still need to allocate chunks of contiguous memory, ... in user mode and at kernel level if you need dynamic ... I read "data segment". ... a stack from growing into the heap (only a possibility of detecting it ...
    (comp.os.minix)
  • Re: Old version of lilo fails to boot 2.6.23
    ... correct the stack frames to 16 bit. ... 16-bit segment registers are set to the same value. ... "upgrading" lilo to grub-0.97 and found that grub works just fine. ... The boot protocol was pretty much formalized by Werner Amsberger, the original LILO author, with contributions from Hans Lermen and myself. ...
    (Linux-Kernel)
  • Re: Old version of lilo fails to boot 2.6.23
    ... It turns out that my version of lilo doesn't set ... entering the new 16-bit real mode kernel loader code. ... correct the stack frames to 16 bit. ... 16-bit segment registers are set to the same value. ...
    (Linux-Kernel)
  • Re: [9fans] Race condition in /sys/src/9/pc/trap.c?
    ... not have access to this proc's stack. ... What if sp points inside a segment which is not the actual stack segment? ... the kernel always moves data to/from user ...
    (comp.os.plan9)
  • Re: How to develop a random number generation device
    ... but if you can manage to create a buffer overflow in a kernel process ... (the TCP/IP stack being a common target here, ... A messed up data segment is still the data segment. ...
    (sci.electronics.design)