Re: OOM killer "Out of Memory: Killed process" SOLUTIONS / SUMMARY



2007/8/10, Eric Sisler <esisler@xxxxxxxxxxxxxxxxxxxxx>:
Since this problem seems to pop up on different lists, this message has
been cross-posted to the general Red Hat discussion list, the RHEL3
(Taroon) list and the RHEL4 (Nahant) list. My apologies for not having
the time to post this summary sooner.

I would still be banging my head against this problem were it not for
the generous assistance of Tom Sightler <ttsig@xxxxxxxxxxxxx> and Brian
Long <brilong@xxxxxxxxx>.

In general, the out of memory killer (oom-killer) begins killing
processes, even on servers with large amounts (6GB+) of RAM. In many
cases people report plenty of "free" RAM and are perplexed as to why the
oom-killer is whacking processes. Indications that this has happened
appear in /var/log/messages:
Out of Memory: Killed process [PID] [process name].
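
A quick way to see how often this has happened is to count the matching lines in the log. This is only a minimal sketch: the function name count_oom_kills is made up for illustration, and it assumes the log lives at /var/log/messages and uses the message text shown above.

```shell
#!/bin/sh
# count_oom_kills FILE -> number of oom-killer kill messages found in FILE.
# (Hypothetical helper name; matches only the fixed message prefix quoted above.)
count_oom_kills() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'Out of Memory: Killed process' "$1" || true
}

# Assumed log location; it may differ depending on the syslog configuration.
LOG=/var/log/messages
if [ -r "$LOG" ]; then
    echo "oom-killer kills logged: $(count_oom_kills "$LOG")"
fi
```

Reading /var/log/messages normally requires root, which is why the sketch skips the report when the file is not readable.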

Is having a large amount of memory the important factor here? I mean,
can this happen with either 2GB or 10GB?
I'm just curious; I have never faced this problem myself. I find
this topic really interesting, though.


In my case I was upgrading various VMware servers from RHEL3 / VMware
GSX to RHEL4 / VMware Server. One of the virtual machines on a server
with 16GB of RAM kept getting whacked by the oom-killer. Needless to
say, this was quite frustrating.

As it turns out, the problem was low memory exhaustion. Quoting Tom:
"The kernel uses low memory to track allocations of all memory thus a
system with 16GB of memory will use significantly more low memory than a
system with 4GB, perhaps as much as 4 times. This extra pressure
happens from the moment you turn the system on before you do anything at
all because the kernel structures have to be sized for the potential of
tracking allocations in four times as much memory."
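
To put rough numbers on Tom's point, the arithmetic can be sketched as below. The figures are assumptions, not taken from the post: 4 KiB pages and roughly 32 bytes of kernel bookkeeping per page (the real size depends on kernel version and configuration). Per-page descriptors are also only one of several consumers of low memory.

```shell
#!/bin/sh
# Rough estimate of the low memory needed just to track every page of RAM.
# ASSUMPTIONS (not from the original post): 4 KiB pages and about
# 32 bytes of kernel descriptor per page.
PAGE_KB=4
DESC_BYTES=32

# page_desc_cost_mb RAM_GB -> MiB of page descriptors needed for RAM_GB of RAM
# (hypothetical helper name, for illustration only)
page_desc_cost_mb() {
    ram_gb=$1
    pages=$(( ram_gb * 1024 * 1024 / PAGE_KB ))
    echo $(( pages * DESC_BYTES / 1024 / 1024 ))
}

for gb in 4 16; do
    echo "${gb} GB RAM: about $(page_desc_cost_mb "$gb") MB of low memory"
done
```

Under these assumptions a 4GB system spends about 32MB and a 16GB system about 128MB, which matches the "as much as 4 times" estimate and is a noticeable share of the 777MB of low memory shown in the output below.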

You can check the status of low & high memory a couple of ways:

# egrep 'High|Low' /proc/meminfo
HighTotal: 5111780 kB
HighFree: 1172 kB
LowTotal: 795688 kB
LowFree: 16788 kB

# free -lm
             total       used       free     shared    buffers     cached
Mem:          5769       5751         17          0          8       5267
Low:           777        760         16          0          0          0
High:         4991       4990          1          0          0          0
-/+ buffers/cache:        475       5293
Swap:         4773          0       4773
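
If you want to watch for this condition from a script, the LowTotal and LowFree fields can be turned into a percentage. This is a minimal sketch: the function name low_free_pct and the 5% threshold are arbitrary choices for illustration, not recommended values. On 64-bit kernels the Low* fields are typically absent, so the sketch handles that case too.

```shell
#!/bin/sh
# low_free_pct FILE -> integer percentage of low memory still free.
# FILE must be in /proc/meminfo format. Returns non-zero (and prints
# nothing) if the LowTotal field is missing, e.g. on a 64-bit kernel.
# (Hypothetical helper name, for illustration only.)
low_free_pct() {
    awk '/^LowTotal:/ { total = $2 }
         /^LowFree:/  { free = $2 }
         END {
             if (total > 0) printf "%d\n", free * 100 / total
             else exit 1
         }' "$1"
}

# THRESHOLD is an arbitrary example value; tune it for your environment.
THRESHOLD=5
if pct=$(low_free_pct /proc/meminfo); then
    if [ "$pct" -lt "$THRESHOLD" ]; then
        echo "warning: only ${pct}% of low memory is free"
    fi
else
    echo "no LowTotal/LowFree fields found (64-bit kernel?)"
fi
```

With the /proc/meminfo values shown above (LowFree 16788 kB out of LowTotal 795688 kB) this works out to 2% free, so the warning would fire.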

When low memory is exhausted, it doesn't matter how much high memory is
available; the oom-killer will begin whacking processes to keep the
server alive.

There are a couple of solutions to this problem:

If possible, upgrade to 64-bit Linux. This is the best solution because
*all* memory becomes low memory. If you run out of low memory in this
case, then you're *really* out of memory. ;-)

If limited to 32-bit Linux, the best solution is to run the hugemem
kernel. This kernel splits low/high memory differently (a 4GB/4GB
kernel/user split rather than the default 1GB/3GB), and in most
cases should provide enough low memory to map high memory. It is
usually an easy fix: simply install the hugemem kernel RPM & reboot.

Does hugemem act as a module or...? How can it expand the low memory?


If running the 32-bit hugemem kernel isn't an option either, you can try
setting /proc/sys/vm/lower_zone_protection to a value of 250 or more.
This makes the kernel defend the low zone more aggressively against
allocations that could instead be satisfied from the high memory zone.
As far as I know, this option isn't available until the 2.6.x kernel.
Some experimentation to find the best setting
for your environment will probably be necessary. You can check & set
this value on the fly via:
# cat /proc/sys/vm/lower_zone_protection
# echo "250" > /proc/sys/vm/lower_zone_protection

To set this option on boot, add the following to /etc/sysctl.conf:
vm.lower_zone_protection = 250

In the first solution, your suggestion was to upgrade to 64-bit. And as
you wrote, if you still run out of low memory... pray.
What if you set vm.lower_zone_protection = 250? Would it give you
some "extra time" before the disaster?


As a last-ditch effort, you can disable the oom-killer. This option can
cause the server to hang, so use it with extreme caution (and at your
own risk)!
Check status of oom-killer:
# cat /proc/sys/vm/oom-kill

Turn oom-killer off/on:
# echo "0" > /proc/sys/vm/oom-kill
# echo "1" > /proc/sys/vm/oom-kill

To make this change take effect at boot time, add the following
to /etc/sysctl.conf:
vm.oom-kill = 0

For processes that would have been killed, but weren't because the
oom-killer is disabled, you'll see the following message
in /var/log/messages:
"Would have oom-killed but /proc/sys/vm/oom-kill is disabled"

Sorry for being so long-winded. I hope this helps others who have
struggled with this problem.


Really interesting post, Eric.
In the end, what did you do? Upgrade? Disable oom-killer? Pray? Delete
VMware Server? :-)

All the best.
Manuel

--
redhat-list mailing list
unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
https://www.redhat.com/mailman/listinfo/redhat-list


