Re: Possible dcache BUG

From: R. J. Wysocki (rjw_at_sisk.pl)
Date: 08/22/04

    To: gene.heskett@verizon.net
    Date:	Sun, 22 Aug 2004 13:42:54 +0200
    
    

    On Sunday 22 of August 2004 07:05, Gene Heskett wrote:
    > On Friday 20 August 2004 16:17, Gene Heskett wrote:
    > >On Friday 20 August 2004 16:11, R. J. Wysocki wrote:[...]
    > >
    > >>There's a simple test you can do unless your DIMMs must go in pairs
    > >> (I don't remember if it's required by nforce2): remove one of them
    > >> and see what happens.
    > >
    > >To get dual channel DDR, they have to be in a pair. Since this
    > > post, they've been swapped one for the other, and I'll be curious
    > > to see if the address goes to an even address when it errors, which
    > > it hasn't yet.
    >
    > It has, one time in 35 hours now. The problem is considerably
    > reduced.
    >
    > Whereas the error was always at an odd address, and in the 2nd LSbyte,
    > now it's still an odd address but the error has moved to the most
    > significant byte of a 32-bit fetch:
    >
    > [root@coyote memburn]# ./memburn 512
    > Starting test with size 512 megs..
    > Passed round 2308, elapsed 41225.98.
    > FAILED at round 2309/40220063: got ff000000, expected 00000000!!!
    > REREAD: ff000000, ff000000, ff000000!!!
    > [root@coyote memburn]# ./memburn 512
    > Starting test with size 512 megs..
    > Passed round 2636, elapsed 60944.15.
    >
    > As can be seen, I restarted it, and it has run quite a few more loops
    > now without error. There has been no more Oops, but with memburn eating
    > 512 megs, half my ram, and kde-3.3 under construction by konstruct,
    > I've peaked at nearly a gig of swap, and 754 megs in swap right now.
    > Sure, it's a bit laggy, but not unusable.
    >
    > So now the question is: since the error address is always odd, which
    > stick is it?

    Hard to tell. I think the memory controller is interleaving them for
    efficiency but the question remains which one is regarded as the first.
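
    To illustrate the interleaving point, here is a minimal Python sketch.
    It assumes that consecutive 8-byte chunks alternate between two channels;
    that granularity, the channel numbering, and the helper names
    channel_for_address and flipped_byte_lanes are all hypothetical and are
    not taken from any nforce2 documentation. Note also that memburn runs in
    user space and sees virtual addresses, so this only shows the shape of
    the reasoning, not which stick is actually at fault.

```python
# Hypothetical illustration only: the interleave granularity and the
# channel numbering below are assumptions, not documented nforce2 behaviour.

def channel_for_address(phys_addr, interleave_bytes=8):
    """Return 0 or 1: the channel that would serve phys_addr if consecutive
    interleave_bytes-sized chunks alternated between two channels."""
    return (phys_addr // interleave_bytes) % 2


def flipped_byte_lanes(got, expected):
    """Return the byte lanes (0 = least significant) of a 32-bit word
    in which got and expected differ."""
    diff = (got ^ expected) & 0xFFFFFFFF
    return [lane for lane in range(4) if (diff >> (8 * lane)) & 0xFF]


if __name__ == "__main__":
    # Values from the memburn failure quoted above.
    print(flipped_byte_lanes(0xFF000000, 0x00000000))
    # Two made-up addresses, 8 bytes apart, land on different channels
    # under the assumed mapping.
    print(channel_for_address(0x1000), channel_for_address(0x1008))
```

    Under these assumptions the reported failure touches only the top byte
    lane of the word, and which stick that corresponds to still depends on
    which DIMM the controller treats as channel 0.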

    BTW, since this suggests that the DRAM is to blame, you can try fiddling a
    bit with its timings (provided the board setup allows you to do this). For
    example, you can set them to 3-3-3 or equivalent (generally, raise the
    values, i.e. relax the timings) and check whether and how this affects the
    memburn results. Just an idea, you know. ;-)
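
    For comparing memburn runs before and after a timing change, a small
    script along these lines could tally which bits flip. It is only a
    sketch: the regular expression is modelled on the single FAILED line
    quoted above, and the name summarize_failures is made up for this
    example. The meaning of the second number after "round" is not stated
    in the output, so the script ignores it.

```python
import re
from collections import Counter

# Modelled on the FAILED line in the quoted memburn output; other memburn
# versions may print a different format (assumption).
FAILED_RE = re.compile(
    r"FAILED at round (\d+)/(\d+): got ([0-9a-fA-F]{8}), expected ([0-9a-fA-F]{8})"
)


def summarize_failures(log_text):
    """Count failures per flipped-bit mask (got XOR expected)."""
    masks = Counter()
    for match in FAILED_RE.finditer(log_text):
        got = int(match.group(3), 16)
        expected = int(match.group(4), 16)
        masks[got ^ expected] += 1
    return masks


if __name__ == "__main__":
    sample = "FAILED at round 2309/40220063: got ff000000, expected 00000000!!!\n"
    for mask, count in summarize_failures(sample).items():
        print(f"mask {mask:08x}: {count} failure(s)")
```

    If the same mask keeps showing up at one timing setting and disappears
    at a more relaxed one, that would support the DRAM theory.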

    Greetings,

    -- 
    Rafael J. Wysocki
    ----------------------------
    For a successful technology, reality must take precedence over public 
    relations, for nature cannot be fooled.
    					-- Richard P. Feynman
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at  http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at  http://www.tux.org/lkml/
    
