Re: system keeps freezing once every 24 hours / random apps crashing



Mark v Wolher wrote:
> Alistair John Strachan wrote:
>
>>On Saturday 31 December 2005 00:42, Mark v Wolher wrote:
>>
>>
>>>Alistair John Strachan wrote:
>>>
>>>
>>>>On Saturday 31 December 2005 00:20, Mark v Wolher wrote:
>>>>[snip]
>>>>
>>>>
>>>>
>>>>>>This is good news -- you stand a better chance of achieving the
>>>>>>stability you require by eliminating variables. VMWare and NVIDIA are
>>>>>>useful software, and I would not deny that, but they are closed source
>>>>>>and thus any conflicts resulting from their use are not necessarily LKML
>>>>>>material (however, if the interaction is generic and is the result of
>>>>>>a kernel bug, then the maintainer would very much like to hear about it).
>>>>>
>>>>>Okay, I have something interesting now. I only had the nvidia module
>>>>>loaded, so that my X configuration starts up as usual (not saying the
>>>>>nvidia module is flawless; I'm sure it still contains bugs).
>>>>>But here is the crash info. This time it was mozilla that crashed, and
>>>>>I think this says more, hehe:
>>>>>
>>>>>Dec 31 00:55:28 localhost kernel: mm/memory.c:106: bad pgd 061f0c08.
>>>>>Dec 31 00:55:28 localhost kernel: mm/memory.c:106: bad pgd 06b96000.
>>>>>Dec 31 00:55:28 localhost kernel: mm/memory.c:106: bad pgd 18000bf8.
>>>>>Dec 31 00:55:28 localhost kernel: ------------[ cut here ]------------
>>>>>Dec 31 00:55:28 localhost kernel: kernel BUG at mm/mmap.c:2214!
>>>>>Dec 31 00:55:28 localhost kernel: invalid operand: 0000 [#1]
>>>>>Dec 31 00:55:28 localhost kernel: SMP
>>>>>Dec 31 00:55:28 localhost kernel: Modules linked in: nvidia
>>>>
>>>>Steady and sure progress. Now, the trace below doesn't explicitly mention
>>>>any nvidia symbols, but this line must disappear before anybody will
>>>>bother to read your report.
>>>>
>>>>Remove the module. This does not mean unload, this means "never load in
>>>>the first place". Then reproduce the problem. If you are successful, send
>>>>a new email (not pinned to this thread) with a subject a la "kernel BUG
>>>>at mm/mmap.c:2214". State that the kernel is not tainted.
>>>>
>>>>At this point all you can do is wait. Good luck!
>>>
>>>Well, I guess I'll have to do that to be sure. But I must say that I did
>>>try the nv module and uninstalled the nvidia binary module. It didn't
>>>matter: the system froze but didn't leave anything in the logs. This
>>>time it did. Doesn't that help at all?
>>>
>>>I'll try again: switch to nv and wait for something to happen. If
>>>someone has more advice in the meantime, or could even check out of
>>>curiosity why it says kernel BUG, I'd appreciate it of course.
>>
>>
>>Probably upwards of 95% of BUGs in mm/ are due to defective memory in the
>>system running the kernel. However, since you claim to have run other OSes
>>successfully on this configuration, I did not suggest it.
>>
>>However, I would highly recommend running memtest86 at least twice on the
>>machine if you cannot track down the source of the problem.
>>
>>It is always worth eliminating hardware.
>>
>
>
> Indeed. I'm going to get some sleep soon, but I'll leave memtest86 running
> for the night, and when I wake up I'll see whether anything is reported.
> It's 2x256 MB PC2100 ECC memory. I'm also expecting new memory next
> Monday or Tuesday, which I can use to replace this memory and rule it
> out either way.
>
> Thanks!
>

G'morning!

memtest86 made 40 passes over the memory; no errors were detected.




