Re: panic, what is the hardware in error ?



On Mon, 27 Nov 2006 10:17:44 +0100, mobidyc wrote this:

"Steve Wolfe" <anx@xxxxxxxxx> wrote in message
news:KcmdnYF31LVEYvXYnZ2dnUVZ_rGdnZ2d@xxxxxxxxxxxxxx
These last few days, my Linux computer has been panicking (four times
in three days) ;(

The last things I see in the console are:

#########################
Message from syslogd@mobidyc at Fri Nov 24 13:26:09 2006 ...
mobidyc kernel: CPU 0: Machine Check Exception: 0000000000000004

Message from syslogd@mobidyc at Fri Nov 24 13:26:09 2006 ...
mobidyc kernel: Bank 1: b600000000000181 at 0000000034ce0340

Message from syslogd@mobidyc at Fri Nov 24 13:26:09 2006 ...
mobidyc kernel: Bank 2: 940040000000017a at 0000000002903280
#########################

Is it a problem with the CPU, the memory or the motherboard?

I started getting those same errors recently on a dual-Opteron; one of
the sticks of RAM was going bad. Looking in the BIOS log also showed
single- and double-bit memory errors. I pulled out the memory attached
to the processor in question, and all of the problems stopped.

steve



Hi,

I think the problem is the motherboard. I've tested one stick of RAM in
each of the three slots with no problem, and I've tested the other stick
with no problem.

When I install the two sticks together (tested in all slots), my Linux
crashes... ;(

I don't know if the motherboard is still under warranty, but I'm going
to contact Asus soon.

Thanks, all.

Back up critical data before system failure. IMO, it's memory.

It sounds like the two sticks will not work together. Are both sticks
of memory exactly the same? Memory modules from different vendors
sometimes don't play well together, especially value RAM versions.
A simple workaround is to run with one stick, though the system will be
slower.

Run Memtest86 on both sticks of memory together for at least an hour.

I suspect faulty memory or the PSU before a defective motherboard or CPU
failure. Temps do not look excessive; AMD XPs run up to 70C.
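
For anyone who wants to read the hex values in the original post, here
is a minimal Python sketch that splits them into flag bits. It assumes
the generic x86 machine-check layout for MCG_STATUS and MCi_STATUS (flag
bits 63..57, error code in the low 16 bits); the function and table
names are made up for illustration, and model-specific bits are not
decoded, so treat the output as a hint rather than a diagnosis.

```python
# Assumed layout: generic x86 machine-check architecture registers.
# Model-specific bits (anything between bit 16 and bit 56) are ignored.
MCI_STATUS_FLAGS = {
    "VAL": 63,    # register holds a valid error record
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # reporting was enabled for this error
    "MISCV": 59,  # MCi_MISC holds extra information
    "ADDRV": 58,  # MCi_ADDR holds a valid address
    "PCC": 57,    # processor context may be corrupt
}

# Flags in the global status word printed after "Machine Check Exception:".
MCG_STATUS_FLAGS = {"RIPV": 0, "EIPV": 1, "MCIP": 2}


def decode_flags(value, layout):
    """Return the names of all flags in `layout` that are set in `value`."""
    return [name for name, bit in layout.items() if (value >> bit) & 1]


def decode_bank(status_hex):
    """Split a bank status string into its flag names and 16-bit error code."""
    value = int(status_hex, 16)
    return {
        "flags": decode_flags(value, MCI_STATUS_FLAGS),
        "error_code": value & 0xFFFF,
    }


if __name__ == "__main__":
    # Values copied from the console messages in the original post.
    print("MCG_STATUS:", decode_flags(int("0000000000000004", 16), MCG_STATUS_FLAGS))
    for bank, status in (("Bank 1", "b600000000000181"),
                         ("Bank 2", "940040000000017a")):
        info = decode_bank(status)
        print(bank, info["flags"], hex(info["error_code"]))
```

Under those assumptions, Bank 1 has the UC and PCC bits set (an
uncorrected error), while Bank 2 does not have UC set. The flags alone
do not say which component is at fault, so swapping sticks and running
Memtest86 as suggested above is still the way to narrow it down.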



