Re: x86-64 bad pmds in 2.6.11.6

From: Linus Torvalds (torvalds_at_osdl.org)
Date: 09/20/05

    Date:	Tue, 20 Sep 2005 10:30:48 -0700 (PDT)
    To: Charles McCreary <mccreary@crmeng.com>
    
    

    On Tue, 20 Sep 2005, Charles McCreary wrote:
    >
    > Another datapoint for this thread. The box spewing the bad pmds messages is a
    > dual opteron 246 on a TYAN S2885 Thunder K8W motherboard. Kernel is
    > 2.6.11.4-20a-smp.

    This is quite possibly the result of an Opteron errata (tlb flush
    filtering is broken on SMP) that we worked around as of 2.6.14-rc2.

    So either just try 2.6.14-rc2, or try the appended patch (it has since
    been confirmed by many more people).

                    Linus

    ---
    diff-tree bc5e8fdfc622b03acf5ac974a1b8b26da6511c99 (from 61ffcafafb3d985e1ab8463be0187b421614775c)
    Author: Linus Torvalds <torvalds@g5.osdl.org>
    Date:   Sat Sep 17 15:41:04 2005 -0700

        x86-64/smp: fix random SIGSEGV issues
        
        They seem to have been due to AMD errata 63/122; the fix is to disable
        TLB flush filtering in SMP configurations.
        
        Confirmed to fix the problem by Andrew Walrond <andrew@walrond.org>
        
        [ Let's see if we'll have a better fix eventually, this is the Q&D
          "let's get this fixed and out there" version ]
        
        Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    diff --git a/arch/x86_64/kernel/setup.c b/arch/x86_64/kernel/setup.c
    --- a/arch/x86_64/kernel/setup.c
    +++ b/arch/x86_64/kernel/setup.c
    @@ -831,11 +831,26 @@ static void __init amd_detect_cmp(struct
     #endif
     }
     
    +#define HWCR 0xc0010015
    +
     static int __init init_amd(struct cpuinfo_x86 *c)
     {
     	int r;
     	int level;
     
    +#ifdef CONFIG_SMP
    +	unsigned long value;
    +
    +	// Disable TLB flush filter by setting HWCR.FFDIS:
    +	// bit 6 of msr C001_0015
    +	//
    +	// Errata 63 for SH-B3 steppings
    +	// Errata 122 for all(?) steppings
    +	rdmsrl(HWCR, value);
    +	value |= 1 << 6;
    +	wrmsrl(HWCR, value);
    +#endif
    +
     	/* Bit 31 in normal CPUID used for nonstandard 3DNow ID;
     	   3DNow is IDd by bit 31 in extended CPUID (1*32+31) anyway */
     	clear_bit(0*32+31, &c->x86_capability);
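
The read-modify-write that the patch performs on HWCR can be tried out outside the kernel. What follows is a minimal userspace sketch, not the kernel code: the MSR address and bit position come from the patch, but `hwcr_set_ffdis`, `hwcr_ffdis_enabled` and `disable_flush_filter_sim` are hypothetical names, and a plain variable stands in for the MSR because `rdmsrl`/`wrmsrl` only exist in kernel context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the patch: HWCR is MSR C001_0015, FFDIS is bit 6. */
#define HWCR_MSR       0xc0010015u
#define HWCR_FFDIS_BIT 6

/* Hypothetical helper: return the given HWCR value with FFDIS set.
 * All other bits are preserved. */
static inline uint64_t hwcr_set_ffdis(uint64_t hwcr)
{
	return hwcr | (UINT64_C(1) << HWCR_FFDIS_BIT);
}

/* Hypothetical helper: report whether FFDIS is set in an HWCR value. */
static inline bool hwcr_ffdis_enabled(uint64_t hwcr)
{
	return ((hwcr >> HWCR_FFDIS_BIT) & 1) != 0;
}

/* Simulated read-modify-write. '*msr' is an ordinary variable standing
 * in for the register; real code would use rdmsrl()/wrmsrl() on
 * HWCR_MSR from kernel context instead. */
void disable_flush_filter_sim(uint64_t *msr)
{
	uint64_t value = *msr;          /* stands in for rdmsrl(HWCR, value) */

	value = hwcr_set_ffdis(value);
	*msr = value;                   /* stands in for wrmsrl(HWCR, value) */

	assert(hwcr_ffdis_enabled(*msr));
}
```

The sketch uses a 64-bit constant for the shift. For bit 6 this gives the same result as the `1 << 6` in the patch; it only matters for bit positions at or above the width of `int`.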
    
