Re: kernel hangs quickly - non-root user reproducible



On Sun, 13 Jul 2008 08:26:42 +0200 Lennart Benschop <lennartb@xxxxxxxxx> wrote:
| phil-news-nospam@xxxxxxxx wrote:
|
|> I've been able to quickly (sometimes instantly) cause the kernel to hang or
|> crash via a shell command by a non-root user. The magic command is LONG:
|
|> other than
|> for this one bug) and to i686 type. It's almost as if there is a hardware
|> bug. But why would it be triggered only when it has an apparent out of
|> memory condition?
| Did you try to run memtest86? It is a memory test program that can be
| run at boot time and it is available on most Linux CDs. Memory
| problems sometimes show up only under specific heavy load conditions
| (typically large compile jobs) and memtest86 is often able to find
| them.

Yes, I ran the one from the Fedora CD and it showed no errors.

What is peculiar here is that there really is no load, per se. This happens
even if the machine was just rebooted. The first thing I do is log in and run
the killer command, and it kills the kernel. If I don't run that command,
the machine can run just about forever, even under quite heavy loads such as
compiling the kernel over and over in a loop. Yet if a process makes enough
brk() calls to demand more memory than is available (a demand the kernel
should at some point simply refuse), the machine dies. Based on the few times
I do get any kernel messages, it looks like there is a stray branch to
address 0 somewhere in the code that handles an out of memory condition.
Sometimes whatever gets executed at location 0 causes a trap for a bad
instruction within a few instructions, before anything worse can happen.
Sometimes whatever is there just hoses the machine (for example, by loading
a goofy register value that makes something go into a loop). That's what it
seems like, anyway. If it were bad memory, shouldn't I be seeing this
problem under other kinds of heavy load, and not only when memory runs out?
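
For comparison, here is a minimal sketch of the behavior I would expect: a
process keeps asking for memory and the kernel eventually says no with an
error, instead of falling over. This is NOT the killer command, and it does
not reproduce the hang. It makes several assumptions of its own: it uses
anonymous mmap() requests rather than brk(), it runs the requests in a child
process, and it relies on a per-process RLIMIT_AS cap rather than a real
system-wide out of memory condition. The name probe_refusal and the sizes
are made up for the illustration. The pages are never touched, so it should
not consume real memory.

```python
import resource
import subprocess
import sys

# Child program: request anonymous mappings of `step` bytes until the kernel
# refuses one (OSError/ENOMEM) or a hard cap on the number of requests is hit.
CHILD = r"""
import mmap, sys
step, max_chunks = int(sys.argv[1]), int(sys.argv[2])
maps, refused = [], False
try:
    for _ in range(max_chunks):
        maps.append(mmap.mmap(-1, step,
                              flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS))
except (OSError, MemoryError):
    refused = True
count = len(maps)
for m in maps:
    m.close()  # give the address space back before reporting
print(count, int(refused))
"""


def probe_refusal(limit_bytes=1 << 30, step=64 << 20, max_chunks=32):
    """Hypothetical helper: return (mappings granted, whether one was refused)."""

    def cap_address_space():
        # Runs in the child just before exec: lower only the soft limit,
        # and never try to exceed an existing hard limit.
        soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        new_soft = limit_bytes
        if hard != resource.RLIM_INFINITY:
            new_soft = min(limit_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))

    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(step), str(max_chunks)],
        preexec_fn=cap_address_space,
        capture_output=True,
        text=True,
        check=True,
    )
    count, refused = result.stdout.split()
    return int(count), bool(int(refused))


if __name__ == "__main__":
    granted, refused = probe_refusal()
    print(f"granted {granted} mappings, refused={refused}")
```

On a healthy Linux system the child should report a refusal well before it
reaches max_chunks, and nothing else on the machine should notice.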

--
|WARNING: Due to extreme spam, googlegroups.com is blocked. Due to ignorance |
| by the abuse department, bellsouth.net is blocked. If you post to |
| Usenet from these places, find another Usenet provider ASAP. |
| Phil Howard KA9WGN (email for humans: first name in lower case at ipal.net) |