Re: [PATCH] oom killer (Core)

From: Andrea Arcangeli (andrea_at_suse.de)
Date: 12/04/04

  • Next message: Nicholas Papadakos: "RE: realtek r8169 + kernel 2.4.24 (openmosix)"
    Date:	Sat, 4 Dec 2004 17:43:53 +0100
    To: Voluspa <lista4@comhem.se>
    
    

    On Sat, Dec 04, 2004 at 01:42:54PM +0100, Voluspa wrote:
    >
    > On 2004-12-04 8:08:40 Andrea Arcangeli wrote:
    >
    > > You can try to put back a might_slee_if(wait), but if it deadlocks
    > with
    > > that change sure it's not a bug in my patch, it's instead a bug
    > > somewhere else that calls alloc_pages w/o GFP_ATOMIC. Ingo's
    > > lowlatency patch would expose the same bug too since they're aliasing
    > > the might_sleep to cond_resched.
    >
    > Putting it back doesn't alter the outcome - hanging. And the original
    > patch, (hope it was the right one) from:
    >
    > http://marc.theaimsgroup.com/?l=linux-kernel&m=110204117506557&w=2

    yes it's the right one ;)

    > root:loke:/usr/src/linux-2.6.9-oomkill# patch -Np1 -i ../oomkill.patch
    > patching file mm/oom_kill.c
    > patching file mm/page_alloc.c
    > Hunk #1 succeeded at 608 (offset -3 lines).
    > Hunk #3 succeeded at 681 (offset -3 lines).
    > patching file mm/swap_state.c
    > patching file mm/vmscan.c
    >
    > has been tried with the following variations. With and without
    > optimizing for size, with and without preempt, with and without kernel
    > boot params (cfq, lapic), cold and hot starts, and then I threw in a smp
    > compile for measure. All have the same behaviour:
    >
    > [...]
    > Checking 'hlt' instruction... OK.
    >
    > [10 minutes wait. Then a long callback trace
    > scrolls off the screen ending like Thomas']
    >
    > <0>Kernel panic - not syncing: Fatal exception in interrupt
    >
    > My toolchain (well, the whole software system) is quite contemporary
    > within the stable branches. Built from scratch with gcc-3.4.3, glibc-
    > 20041011 (nptl) and binutils-2.15.92.0.2
    >
    > No energy control, acpi-pm or whatever it's called, is used here. The
    > machine is extremely stable. Running with 100 percent utilization 24/7.
    >
    > Don't shoot the messenger ;)

    I trust you of course but I've absolutely no idea how can my patch ever
    change any code that runs at that point during boot. mm/oom_kill.c can
    be obviously ruled out. The changes in mm/swap_state.c (two printk in
    show_swap_cache_info) as well can be obviously ruled out. The change in
    mm/vmscan.c as well only makes a difference during an oom condition.

    This mean it has to be the change in mm/page_alloc.c that broke
    something. But even that should never run during boot (except for the
    cond_resched instead of might_sleep_if that you already tried to backout
    separately from the rest). There's simply not enough memory pressure at
    boot in order to recall try_to_free_pages and run the modified code.

    If try_to_free_pages is being recalled during boot them we've a problem
    somewhere else, it should never happen!

    Plus it works like a charm here.

    Can you send me your .config so that I will try to send you privately a
    kernel image built on my machine? (and before sending I'll try to boot
    it locally ;) My .config sure is happily running.

    Many thanks.
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Nicholas Papadakos: "RE: realtek r8169 + kernel 2.4.24 (openmosix)"

    Relevant Pages

    • Re: SUCCESS Re: 2.6.0-test11-mm1
      ... >> patch that's responsible, but it'd take a month to find out anything ... :-/ At least part of the bug (galeon misplaces text) still ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: next-20090220: XFS, IMA: BUG: sleeping function called from invalid context at mm/slub.c:16
      ... I have applied the following patch from Mimi Zohar ... The kernel can not boot without it when IMA is enabled. ... BUG: sleeping function called from invalid context at mm/slub.c:1613 ... not a suitable fix for this bug. ...
      (Linux-Kernel)
    • Re: [PATCH] oom killer (Core)
      ... I tried again from scratch and the kernel is booting without your patch. ... Adding your patch with the same config still does not boot. ... I added some debug output and it calls __alloc_pages a couple of times. ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: Laptops & CPU frequency
      ... > changes - that is, it is not calculated just at boot, but it is updated ... SpeedStep code back around 2.5.73-75 timeframe (which managed to manifest as ... (the final actual fix ended up being slightly different, but that was the bug). ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • [2.6 patch] i386: cleanup boot_cpu_logical_apicid variables
      ... This patch therefore removes the one in mpparse.c. ... /* Processor that is doing the boot up */ ... ver = m->mpc_apicver; ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)