Re: Need help with crash message.



Jean-David Beyer wrote:
The Natural Philosopher wrote:
Jean-David Beyer wrote:
David W. Hodgins wrote:
On Sat, 25 Oct 2008 09:02:19 -0400, Jean-David Beyer
<jeandavid8@xxxxxxxxxxx> wrote:

My machine crashed this morning. Before crashing, it seems to have
logged messages such as:

Oct 23 11:14:47 trillian kernel: EDAC MC0: CE
page 0x12ec2d, offset 0x0, grain 4096, syndrome 0x1042, row 4, channel
0, label "": e7xxx CE

Oct 25 03:35:44 trillian kernel: EDAC MC0: UE
page 0x3c8d3, offset 0x0, grain 4096, row 0, labels ":": e7xxx UE

I've never seen this before, but according to
/usr/src/linux/Documentation/edac.txt, the CE entries are memory
Correctable Errors, while UE are Uncorrectable Errors. Looks like you
have ecc memory modules, and errors are being detected.
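For comparing the two reports, here is a minimal Python sketch that pulls the fields out of messages shaped like the ones quoted above and turns the page number into an approximate physical address. The regular expression is fitted only to those two samples, and the 4096-byte page size is an assumption (usual for x86); `parse_edac` and `PAGE_SIZE` are names made up for this illustration.

```python
import re

# Fitted to the two sample messages in this thread only (assumption);
# other EDAC drivers or kernel versions may format the line differently.
EDAC_RE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<kind>CE|UE)\s+"
    r"page 0x(?P<page>[0-9a-fA-F]+), offset 0x(?P<offset>[0-9a-fA-F]+), "
    r"grain (?P<grain>\d+),"
    r"(?: syndrome 0x(?P<syndrome>[0-9a-fA-F]+),)?"
    r" row (?P<row>\d+)"
)

PAGE_SIZE = 4096  # assumed page size in bytes


def parse_edac(message):
    """Return a dict of fields from one EDAC message, or None if no match."""
    # syslog wraps long messages; collapse all whitespace runs to one space.
    flat = " ".join(message.split())
    m = EDAC_RE.search(flat)
    if m is None:
        return None
    page = int(m.group("page"), 16)
    offset = int(m.group("offset"), 16)
    return {
        "mc": int(m.group("mc")),
        "kind": m.group("kind"),  # "CE" = correctable, "UE" = uncorrectable
        "row": int(m.group("row")),
        "page": page,
        "address": page * PAGE_SIZE + offset,
    }


ce = """Oct 23 11:14:47 trillian kernel: EDAC MC0: CE
page 0x12ec2d, offset 0x0, grain 4096, syndrome 0x1042, row 4, channel
0, label "": e7xxx CE"""
ue = """Oct 25 03:35:44 trillian kernel: EDAC MC0: UE
page 0x3c8d3, offset 0x0, grain 4096, row 0, labels ":": e7xxx UE"""

for msg in (ce, ue):
    info = parse_edac(msg)
    print(info["kind"], "row", info["row"], "address", hex(info["address"]))
```

Under those assumptions the two events land in different rows and at widely separated addresses, so they need not point at the same module.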

I do have ECC memory modules.

Try reseating the memory modules, and then running memtest. You'll
probably have to replace the bad ram, so you may want to run memtest with
one module at a time, to figure out which is bad.
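Before pulling modules, the per-row error counters that the EDAC driver exposes in sysfs may help narrow the search. The following is a rough Python sketch, assuming the legacy `csrowN/ce_count` and `csrowN/ue_count` layout under `/sys/devices/system/edac/mc`; the function name `csrow_error_counts` is invented here, and which physical sockets a given row corresponds to depends on the motherboard.

```python
from pathlib import Path


def csrow_error_counts(root="/sys/devices/system/edac/mc"):
    """Collect ce_count/ue_count for every mc*/csrow* directory under root."""
    counts = {}
    for csrow in sorted(Path(root).glob("mc*/csrow*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            counter = csrow / name
            if counter.is_file():
                entry[name] = int(counter.read_text().strip())
        if entry:
            # Key looks like "mc0/csrow4".
            counts[f"{csrow.parent.name}/{csrow.name}"] = entry
    return counts


if __name__ == "__main__":
    # Prints nothing if the EDAC sysfs tree is absent on this machine.
    for where, entry in csrow_error_counts().items():
        print(where, entry)
```

A row whose correctable count keeps climbing is a reasonable first candidate for the pair to swap out.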

I would have to run two modules at a time. Some of them have run for 4.5 years with no trouble and some for about 2 years also with no trouble.

Thank you. I have eight 1-GByte memory modules, and I must run them in pairs. IIRC, memtest-86 tries to identify which modules are bad.


That sounds like the bunny.

I ran it for a full pass (all 8 modules): took about 8 hours, and all OK.
It then ran for another 5 1/2 modules OK, and I stopped it because I must run the system normally again tonight.

Remember, it may not actually be the memory. It could be something else on the bus corrupting ACCESS to the chips.

I suppose so, but nothing else is on the path between the memory modules and the MCH. I wonder which bus. I have the usual two IDE busses and four PCI-X busses (not sockets, busses). But the memory goes straight to the MCH chip and on to the processors. Everything else goes to the ICH3 chip or to two P64H2 chips (that drive the four PCI-X busses).

MCH?

Have you any 3rd party cards in there at all, though?

I admit straws are being clutched at here. I am thinking of a time when a slow VIDEO CAPTURE card would, provided that exactly the right memory address was being DMA'ed into from a FLOPPY DISK, corrupt two bytes of it, for example.

However, it's no consolation knowing what caused it if, e.g., the answer is 'new motherboard'.


What worries me is that the pages shown above are completely different.




