Re: 2.6.12-rc1-mm3

From: Borislav Petkov (petkov_at_uni-muenster.de)
Date: 03/31/05

  • Next message: Niu YaWei: "[PATCH] possible bug in quota format v2 support"
    To: Andrew Morton <akpm@osdl.org>
    Date:	Thu, 31 Mar 2005 15:05:48 +0200
    
    

    On Friday 25 March 2005 17:46, Borislav Petkov wrote:
    > Hi Andrew,
    >
    > mm3 still not booting on my machine. Boot option 'nmi_watchdog=2' (my cpu
    > is a dual core pentium 4 HT, 2.60 GHz) gets me a bit further in the boot
    > process but it blocks there too.
    >
    > [output retyped from screen]:
    > kernel: [ 4.109241] PM: Checking swsusp image.
    > kernel: [ 4.109244] PM: Resume from disk failed.
    > kernel: [ 4.112220] VFS: Mounted root (ext2 filesystem) readonly.
    > kernel: [ 4.112465] Freeing unused kernel memory: 188k freed
    > kernel: [ 4.142002] logips2pp: Detected unknown logitech mouse model 1
    > kernel: [ 4.274620] input: PS/2 Logitech Mouse on isa0060/serio1
    > <--- [point of previous blocks without boot option 'nmi_watchdog=2']--->
    > INIT: version 2.86 booting
    > Mounting a tmpfs over /dev... done.
    > Creating initial device nodes... done.
    > Setting parameters of disc: (none).
    > Activating swap.
    > kernel: [ 10.712648] Adding 976744k swap on /dev/hda2. Priority:-1
    > extents:1 Checking root file system...
    > fsck 1.36 (05-Feb-2005)
    > /: clean, 127290/1831424 files, 898566/3662056 blocks
    > [EOF]

    Hi Andrew,

    i finally got to run kdb within mm3 and I got a bit further but am not sure
    whether I'm debugging in the right direction:

    After booting with "kdb=early" I found out that the kernel blocks with the
    partial message:

    kmem_cache_create: Early error in slab task_struct
    kernel BUG at mm/slab.c:1215
    invalid operand: 0000 [#1]
    PREEMPT SMP

    and here all dies. After singlestepping through the code, I found out that
    start_kernel calls at offset 0x14d fork_init, which calls at offset 0x39
    kmem_cache_create. kmem_cache_create performs some initial checks:

       1205 /*
       1206 * Sanity checks... these are all serious usage bugs.
       1207 */
       1208 if ((!name) ||
       1209 in_interrupt() ||
       1210 (size < BYTES_PER_WORD) ||
       1211 (size > (1<<MAX_OBJ_ORDER)*PAGE_SIZE) ||
       1212 (dtor && !ctor)) {
       1213 printk(KERN_ERR "%s: Early error in slab %s\n",
       1214 __FUNCTION__, name);
       1215 BUG();
       1216 }

    And after singlestepping a little bit more, I found out that the
    in_interrupt() check returns true and printk is executed. Here's the
    disassembled code:

    00000b10 <kmem_cache_create>:
    kmem_cache_create():
    mm/slab.c:1201
         b10: 55 push %ebp
         b11: 89 e5 mov %esp,%ebp
         b13: 57 push %edi
         b14: 56 push %esi
         b15: 53 push %ebx
         b16: 83 ec 3c sub $0x3c,%esp
    mm/slab.c:1208
         b19: 8b 55 08 mov 0x8(%ebp),%edx
         b1c: 85 d2 test %edx,%edx
         b1e: 0f 84 01 08 00 00 je 1325 <kmem_cache_create+0x815>
    include/asm/thread_info.h:91
         b24: b8 00 f0 ff ff mov $0xfffff000,%eax
         b29: 21 e0 and %esp,%eax
    include/asm/thread_info.h:89
         b2b: f7 40 14 00 ff ff 0f testl $0xfffff00,0x14(%eax)
         b32: 0f 85 ed 07 00 00 jne 1325 <kmem_cache_create+0x815> <---

    this jump here executes and line 0x1325 is:

    mm/slab.c:1213
        1325: c7 04 24 40 02 00 00 movl $0x240,(%esp)
        132c: 8b 45 08 mov 0x8(%ebp),%eax
        132f: 89 44 24 08 mov %eax,0x8(%esp)
        1333: b8 0d 00 00 00 mov $0xd,%eax
        1338: 89 44 24 04 mov %eax,0x4(%esp)
        133c: e8 fc ff ff ff call 133d <kmem_cache_create+0x82d>
    mm/slab.c:1215
        1341: 0f 0b ud2a
        1343: bf 04 00 00 00 mov $0x4,%edi
        1348: 00 e9 add %ch,%cl
        134a: 13 f8 adc %eax,%edi
        134c: ff (bad)
        134d: ff c7 inc %edi

    and BUG() is called.

    Any suggestions or corrections will be greatly appreciated.

    Regards,
    Boris.
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Niu YaWei: "[PATCH] possible bug in quota format v2 support"

    Relevant Pages

    • Re: [Patch]Fix an error in copy_page_range
      ... > Although maybe this bug does not break anything currently..., ... Andrew and/or Linus, ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: [PATCH] fix spurious OOM kills
      ... The only thing I know about recent swap-token changes, is that Andrew ... early-oom failures in my queue (the VM bug was introduced sometime ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: Odd OpenSolaris extremely high load average bug
      ... So this does not appear to be a display bug, ... hmm... ... I probably need to look at it with kmdb, interrupting kernel during ... various points of the booting, ...
      (comp.unix.solaris)
    • Re: Let us set the record staight Steve Two names
      ... I posted what you said about Andrew's wife as best I could from ... Andrew dealing with me, he does his usuall wimp routine and goes to MAtt. ... And if I bug YOU, two names, so be it. ... I also thank Berman for informing me about Shade ...
      (alt.vacation.las-vegas)
    • Re: launch applet with APPLET or PLUGIN
      ... > fails in my Firefox, but the next one, with the suspicious semicolon ... Now I have something to report to the Tomcat people. ... ...does seem like a bug. ... Andrew Thompson ...
      (comp.lang.java.programmer)