Freeze on 2.4 kernels.

From: Thierry Coutelier (Thierry.Coutelier_at_linux.lu)
Date: 09/17/04

    Date:	Fri, 17 Sep 2004 16:02:58 +0200
    To: linux-kernel@vger.kernel.org
    
    

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA1

    Every few weeks (sometimes after 2 days, often after 3 weeks, and
    sometimes only after 9 weeks) our kernel freezes. Nothing appears on
    the screen or the serial console apart from some VJ decompression
    errors, which we see at all times, and even Num Lock does not respond.
    We tried enabling the SysRq keys, but those do not work either.

    We are using Linux boxes to offer Satellite Internet.
    We still use RedHat 7.[23] and 2.4 kernels.

    The system uses rp-l2tp and/or pptpd with pppd.
    On the outgoing interface (the one that sends traffic to the
    satellite) we used the CBQ queuing discipline and now use HTB.

    The kernels range from 2.4.6 to 2.4.25 with some modifications
    (tcp_input). We also tried a standard kernel whose only change is
    that dev_alloc_name was modified to support up to 900 names.

    The hardware is Dell PowerEdge servers with Perc2 or Perc3 cards. We
    also tried HP servers and saw the same problem. We tried different
    firmware releases for the Perc cards, with no change.

    The NICs are mostly Intel EEpro 100. We tried both the Intel driver
    and the community driver, with no better results.

    The problem may happen more often (every 2 to 3 days) when we
    simulate a lot of ppp connections/disconnections (80 users/minute),
    but in some cases the system hangs even without many users.

    The platforms we run carry between 25 and 200 simultaneous
    connections. They have one, two, or even four CPUs and between
    512 Mbytes and 4 Gbytes of RAM.

    We could not find any parameter (load, memory, swap ...) that rises
    before the freeze.

    Could anyone give me a hint as to what else to do or test?
    Where could the problem be?

    Here are some stack traces that we were able to get on a serial console:

    Unable to handle kernel paging request at virtual address 5f47534d
     printing eip:
    c0119923
    *pde = 00000000
    Oops: 0000
    CPU: 1
    EIP: 0010:[<c0119923>] Not tainted
    EFLAGS: 00010046
    eax: 00000000 ebx: 00000000 ecx: 00000001 edx: 5f47534d
    esi: dfff6000 edi: 00000001 ebp: d0939fbc esp: d0939f90
    ds: 0018 es: 0018 ss: 0018
    Process ip-up (pid: 27133, stackpage=d0939000)
    Stack: 00000000 c0306438 00000000 5f47534d 00000000 d0938000 fffffc18 c0374ce0
           d0938000 00000001 00000006 bfffe148 c010932d 00000000 082a73a4 082a71b8
           00000001 00000006 bfffe148 082a71c4 0000002b 0000002b ffffff00 08096305
    Call Trace: [<c010932d>]

    Code: 8b 02 89 45 e0 0f 18 00 81 fa 20 6e 30 c0 0f 85 79 ff ff ff
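
A quick check that can be useful with a trace like this is whether the faulting address looks like a pointer at all. The sketch below is only an illustration (the helper name `address_as_ascii` is made up here): it lays the 32-bit value out in little-endian order, which is how i386 stores it in memory, and prints any printable ASCII it finds. A readable result proves nothing on its own, but it may suggest that string data overwrote a pointer.

```python
def address_as_ascii(addr: int) -> str:
    """Return the 4 bytes of a 32-bit value in i386 memory order,
    showing non-printable bytes as '.'."""
    raw = addr.to_bytes(4, "little")
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


# Faulting address (also the value of edx) from the oops above.
fault_addr = 0x5F47534D
print(address_as_ascii(fault_addr))  # prints: MSG_
```

For the address in this oops the four bytes read as text, so it may be worth looking for code that writes strings near scheduler data structures.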

    After ksymoops analysis:

    ksymoops 2.4.1 on i686 2.4.25. Options used
         -V (default)
         -k /proc/ksyms (default)
         -l /proc/modules (default)
         -o /lib/modules/2.4.25/ (default)
         -m /boot/System.map-2.4.25 (default)

    Warning: You did not tell me where to find symbol information. I will
    assume that the log matches the kernel and modules that are running
    right now and I'll use the default options above for symbol resolution.
    If the current kernel and/or modules do not match the log, you can get
    more accurate output by telling me the kernel version and where to find
    map, modules, ksyms etc. ksymoops -h explains the options.

    Warning (compare_maps): ip_conntrack symbol GPLONLY_ip_conntrack_expect_find_get not found in /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o. Ignoring /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): ip_conntrack symbol GPLONLY_ip_conntrack_expect_put not found in /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o. Ignoring /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): ip_conntrack symbol GPLONLY_ip_conntrack_find_get not found in /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o. Ignoring /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): ip_conntrack symbol GPLONLY_ip_conntrack_put not found in /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o. Ignoring /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): mismatch on symbol ip_conntrack_destroyed , ip_conntrack says e0ed4b78, /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o says e0ed42e4. Ignoring /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): mismatch on symbol ip_conntrack_hash , ip_conntrack says e0ed4b90, /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o says e0ed42fc. Ignoring /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): mismatch on symbol ip_conntrack_htable_size , ip_conntrack says e0ed4b7c, /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o says e0ed42e8. Ignoring /lib/modules/2.4.25/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): mismatch on symbol usb_devfs_handle , usbcore says e0d4e274, /lib/modules/2.4.25/kernel/drivers/usb/usbcore.o says e0d4dcd4. Ignoring /lib/modules/2.4.25/kernel/drivers/usb/usbcore.o entry
    Unable to handle kernel paging request at virtual address 5f47534d
    c0119923
    *pde = 00000000
    Oops: 0000
    CPU: 1
    EIP: 0010:[<c0119923>] Not tainted
    Using defaults from ksymoops -t elf32-i386 -a i386
    EFLAGS: 00010046
    eax: 00000000 ebx: 00000000 ecx: 00000001 edx: 5f47534d
    esi: dfff6000 edi: 00000001 ebp: d0939fbc esp: d0939f90
    ds: 0018 es: 0018 ss: 0018
    Process ip-up (pid: 27133, stackpage=d0939000)
    Stack: 00000000 c0306438 00000000 5f47534d 00000000 d0938000 fffffc18 c0374ce0
           d0938000 00000001 00000006 bfffe148 c010932d 00000000 082a73a4 082a71b8
           00000001 00000006 bfffe148 082a71c4 0000002b 0000002b ffffff00 08096305
    Call Trace: [<c010932d>]
    Code: 8b 02 89 45 e0 0f 18 00 81 fa 20 6e 30 c0 0f 85 79 ff ff ff

    |>>>EIP; c0119923 <schedule+173/4c0> <=====

    Trace; c010932d <reschedule+5/c>
    Code; c0119923 <schedule+173/4c0>
    00000000 <_EIP>:
    Code; c0119923 <schedule+173/4c0> <=====
       0:   8b 02                     mov    (%edx),%eax   <=====
    Code; c0119925 <schedule+175/4c0>
       2:   89 45 e0                  mov    %eax,0xffffffe0(%ebp)
    Code; c0119928 <schedule+178/4c0>
       5:   0f 18 00                  prefetchnta (%eax)
    Code; c011992b <schedule+17b/4c0>
       8:   81 fa 20 6e 30 c0         cmp    $0xc0306e20,%edx
    Code; c0119931 <schedule+181/4c0>
       e:   0f 85 79 ff ff ff         jne    ffffff8d <_EIP+0xffffff8d> c01198b0 <schedule+100/4c0>

    9 warnings issued. Results may not be reliable.
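
Since ksymoops warns that its results may not be reliable, the symbol offsets can be cross-checked by hand. The following is a minimal sketch of the address-to-symbol lookup, not ksymoops itself. The start addresses and sizes in `SYMBOLS` are derived from the offsets printed in the trace above (for example `schedule+173/4c0` at `c0119923`), not read from a real System.map, and the function name `resolve` is chosen only for this example.

```python
import bisect

# (start address, size, name), sorted by start address.
# Assumption: values are back-computed from the trace, e.g.
# 0xc0119923 - 0x173 = 0xc01197b0 for schedule.
SYMBOLS = [
    (0xC0109328, 0x00C, "reschedule"),
    (0xC01197B0, 0x4C0, "schedule"),
]


def resolve(addr):
    """Return 'name+offset/size' for addr, or None if no symbol covers it."""
    starts = [start for start, _, _ in SYMBOLS]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, size, name = SYMBOLS[i]
    if addr >= start + size:
        return None
    return f"{name}+{addr - start:x}/{size:x}"


print(resolve(0xC0119923))  # prints: schedule+173/4c0
print(resolve(0xC010932D))  # prints: reschedule+5/c
```

Against a real System.map the table would be loaded from the file, and an entry that disagrees with the trace would point to a map that does not match the running kernel.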

    - ----

    ksymoops 2.4.1 on i686 2.4.25-SES. Options used
         -V (default)
         -k /proc/ksyms (default)
         -l /proc/modules (default)
         -o /lib/modules/2.4.25-SES/ (default)
         -m /boot/System.map-2.4.25-SES (default)

    Warning: You did not tell me where to find symbol information. I will
    assume that the log matches the kernel and modules that are running
    right now and I'll use the default options above for symbol resolution.
    If the current kernel and/or modules do not match the log, you can get
    more accurate output by telling me the kernel version and where to find
    map, modules, ksyms etc. ksymoops -h explains the options.

    Warning (compare_maps): ip_conntrack symbol GPLONLY_ip_conntrack_expect_find_get not found in /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o. Ignoring /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): ip_conntrack symbol GPLONLY_ip_conntrack_expect_put not found in /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o. Ignoring /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): ip_conntrack symbol GPLONLY_ip_conntrack_find_get not found in /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o. Ignoring /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): ip_conntrack symbol GPLONLY_ip_conntrack_put not found in /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o. Ignoring /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): mismatch on symbol ip_conntrack_destroyed , ip_conntrack says e0edab98, /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o says e0eda304. Ignoring /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): mismatch on symbol ip_conntrack_hash , ip_conntrack says e0edabb0, /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o says e0eda31c. Ignoring /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): mismatch on symbol ip_conntrack_htable_size , ip_conntrack says e0edab9c, /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o says e0eda308. Ignoring /lib/modules/2.4.25-SES/kernel/net/ipv4/netfilter/ip_conntrack.o entry
    Warning (compare_maps): mismatch on symbol my_classid , sch_miq says e0ed1a20, /lib/modules/2.4.25-SES/kernel/net/sched/sch_miq.o says e0ed19a0. Ignoring /lib/modules/2.4.25-SES/kernel/net/sched/sch_miq.o entry
    Warning (compare_maps): mismatch on symbol usb_devfs_handle , usbcore says e0d4e294, /lib/modules/2.4.25-SES/kernel/drivers/usb/usbcore.o says e0d4dcf4. Ignoring /lib/modules/2.4.25-SES/kernel/drivers/usb/usbcore.o entry
    Unable to handle kernel NULL pointer dereference at virtual address 00000000
    c0119923
    *pde = 00000000
    Oops: 0000
    CPU: 1
    EIP: 0010:[<c0119923>] Tainted: P
    Using defaults from ksymoops -t elf32-i386 -a i386
    EFLAGS: 00010046
    eax: 00000000 ebx: 083ca488 ecx: 00000001 edx: 00000000
    esi: dfff6000 edi: 00000001 ebp: cc8a1f98 esp: cc8a1f6c
    ds: 0018 es: 0018 ss: 0018
    Process tbectrld (pid: 21783, stackpage=cc8a1000)
    Stack: 00000082 cb1b44c0 cc8a0000 00000000 c01209dc cc8a0000 fffffc18 c0376ce0
           c160c220 dfffb200 cc8a0000 00000000 c0120e4d cc8a0000 c160c220 cc8a0000
           40170c44 00000000 bffffcd8 c0120fc3 00000000 c010927f 00000000 00001000
    Call Trace: [<c01209dc>] [<c0120e4d>] [<c0120fc3>] [<c010927f>]
    Code: 8b 02 89 45 e0 0f 18 00 81 fa a0 7f 30 c0 0f 85 79 ff ff ff

    |>>>EIP; c0119923 <schedule+173/4c0> <=====

    Trace; c01209dc <exit_notify+dc/360>
    Trace; c0120e4d <do_exit+1ed/330>
    Trace; c0120fc3 <sys_exit+13/20>
    Trace; c010927f <system_call+33/38>
    Code; c0119923 <schedule+173/4c0>
    00000000 <_EIP>:
    Code; c0119923 <schedule+173/4c0> <=====
       0:   8b 02                     mov    (%edx),%eax   <=====
    Code; c0119925 <schedule+175/4c0>
       2:   89 45 e0                  mov    %eax,0xffffffe0(%ebp)
    Code; c0119928 <schedule+178/4c0>
       5:   0f 18 00                  prefetchnta (%eax)
    Code; c011992b <schedule+17b/4c0>
       8:   81 fa a0 7f 30 c0         cmp    $0xc0307fa0,%edx
    Code; c0119931 <schedule+181/4c0>
       e:   0f 85 79 ff ff ff         jne    ffffff8d <_EIP+0xffffff8d> c01198b0 <schedule+100/4c0>

    10 warnings issued. Results may not be reliable.
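
Both traces end in the same `jne`, so it may help to confirm where that jump goes. The sketch below (the function name `jcc_rel32_target` is an assumption for this example) decodes the 6-byte `0f 85 rel32` encoding from the Code: line and computes the target from the address of the instruction. The jump lands 0x87 bytes back, inside `schedule`, which is consistent with the faulting `mov (%edx),%eax` sitting in a loop that follows a chain of pointers, although the trace alone does not prove that.

```python
import struct


def jcc_rel32_target(insn_addr: int, insn: bytes) -> int:
    """Target of a 6-byte near conditional jump (0f 8x rel32) on i386."""
    if len(insn) != 6 or insn[0] != 0x0F or not 0x80 <= insn[1] <= 0x8F:
        raise ValueError("not a 0f 8x rel32 conditional jump")
    # The displacement is a signed little-endian 32-bit value,
    # relative to the address of the next instruction.
    (rel,) = struct.unpack("<i", insn[2:])
    return (insn_addr + len(insn) + rel) & 0xFFFFFFFF


jne = bytes.fromhex("0f 85 79 ff ff ff")
print(hex(jcc_rel32_target(0xC0119931, jne)))  # prints: 0xc01198b0
```

The result matches the `c01198b0 <schedule+100/4c0>` that ksymoops prints, so at least this part of the decoding agrees with the raw code bytes despite the warnings.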

    - --
    Thierry Coutelier Président LiLux asbl
    7, Rue Jacques Sturm L-2556 Luxembourg
    Office:+352 710725 608 Home:+352 406776
    http://www.lilux.lu/

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.2.4 (GNU/Linux)
    Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

    iD8DBQFBSu6SPOfrcNNQX7oRAge+AJ9fdYdf0/AxEbDdd/LGaJBs0BU28wCfX9ja
    FCZr3mX4ox6hHinkcRMXUy8=
    =q7bP
    -----END PGP SIGNATURE-----

