Re: Consistent kernel hang during heavy TCP connection handling load

From: Daniel Stekloff (dsteklof_at_us.ibm.com)
Date: 09/30/04

    To: Bill Davidsen <davidsen@tmr.com>, aathan-linux-kernel-1542@cloakmail.com
    Date:	Thu, 30 Sep 2004 11:18:14 -0700
    
    

    On Thu, 2004-09-30 at 09:34, Bill Davidsen wrote:
    > Andrew A. wrote:
    > > Jan,
    > >
    > > Thanks for responding. When I got no responses, I searched for ways to get more data out of the kernel--I must say that it has been
    > > quite a journey to identify what is working, where to get it, and how to install it when it comes to kernel
    > > debugging/crash-data-gathering tools. LKCD for example, is not available at the location you'll eventually arrive at if you search
    > > for it in google ... it's not obvious what its state is (current/defunct/superseded), there's KDB, KGDB, netdump, netconsole,
    > > netlog, diskdump (confusingly known as lkdump) etc. etc. And then, even if you do figure out what tools are current, you then have
    > > to match the tool to the particular kernel version you are running -- which can be a task and a half unto itself.
    > >
    > > Is diskdump available for 2.4? Can anyone comment on the choice of tools below?
    > >
    > > Anyway, I have also done all of the following:
    > >
    > > (1) Enabled netdump/netconsole on 2.6.8.1-521 Fedora Core kernel, after first fixing the startup scripts. Fixes can be found at
    > > www.memeplex.com/Linux.html. Note that after I also fixed crash.c to be a 2.6-compliant kernel module and loaded it to test
    > > netdump, I always end up with a vmcore-incomplete image approx 45k in size, on the netdump-server. Can anyone tell me if this is
    > > absurdly small, and if so, what might be the solution? The client box always reboots so I suspect too-small timeouts are the issue.
    >
    > My experience is 100% with RH kernels, but the dump should be about
    > memory size, in my case 2.5G or 4G and it is. But I did see hangs which
    > resulted in the size you mention, a few k and hang.

    Yep, the vmcore should be around the size of the memory on the dumping
    system. 45k is far too small, and vmcore-incomplete was never turned
    into vmcore, so the dump did not complete.

    Are you able to build a new kernel? Here's a fix that I think
    will help. It should speed up the dump and help with the timeout issues.

    drivers/net/netdump.c :

    1) Initialize jiffy_cycles to 1000 * (1000000/HZ) -

    -static unsigned long long t0, jiffy_cycles;
    +static unsigned long long t0, jiffy_cycles = 1000 * (1000000/HZ);
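For reference, here is a small sketch of what that initializer evaluates to at common tick rates. The helper name default_jiffy_cycles() is hypothetical and not part of netdump.c, and reading the factor 1000 as "cycles per microsecond" (roughly a 1 GHz clock) is an assumption; the message above does not say how netdump.c uses the value.

```c
/* Hypothetical helper, not part of netdump.c: evaluates the initializer
 * from the patch, 1000 * (1000000/HZ), for a given tick rate. */
unsigned long long default_jiffy_cycles(unsigned int hz)
{
    /* 1000000/hz is the number of microseconds in one jiffy.
     * The factor 1000 is assumed here to mean cycles per microsecond,
     * i.e. a rough 1 GHz estimate. */
    return 1000ULL * (1000000 / hz);
}
```

With HZ=1000 this gives 1000000, and with HZ=100 it gives 10000000, so the counter starts from a non-zero estimate instead of the implicit 0 of an uninitialized static.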

    2) Change prev_jiffies from an "int" to an "unsigned long" in the
    print_status() function -

    - static int prev_jiffies = 0;
    + static unsigned long prev_jiffies = 0;
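A rough illustration of why the type matters, assuming a machine where unsigned long is 64 bits and int is 32 bits, and assuming print_status() subtracts prev_jiffies from the current jiffies value (the message does not show that code). Both function names are hypothetical, and fixed-width types stand in for int and unsigned long so the behavior is the same on any host.

```c
#include <stdint.h>

/* Pre-patch model: the previous reading is kept in a 32-bit signed int,
 * so the upper 32 bits of the 64-bit tick count are lost. */
uint64_t elapsed_with_int(uint64_t now, uint64_t then)
{
    int32_t prev = (int32_t)(uint32_t)then;  /* truncated copy of the counter */
    return now - (uint64_t)prev;             /* prev is widened back for the subtraction */
}

/* Post-patch model: the previous reading has the same width as the counter. */
uint64_t elapsed_with_ulong(uint64_t now, uint64_t then)
{
    uint64_t prev = then;
    return now - prev;
}
```

For readings of 0x100000005 and then 0x100000008, the second version reports 3 elapsed ticks, while the first reports 0x100000003 because the stored value was cut down to 5. On a 32-bit machine the two types have the same width and only the signedness differs.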

    Let me know if this helps,

    Thanks,

    Dan


