Re: GART error 11 (fwd)

From: Andi Kleen (ak_at_muc.de)
Date: 05/27/04

    Date:	Thu, 27 May 2004 19:13:52 +0200
    To: Arthur Perry <kernel@linuxfarms.com>
    
    

    On Thu, May 27, 2004 at 12:05:38PM -0400, Arthur Perry wrote:
    > And perhaps this may be the case, maybe the hardware should not report
    > these errors (which may not actually be gart errors after all) just
    > because the GART has been set up.
    > However, my failure mode seems to be that I only get these errors when the
    > agp driver is loaded on a machine that does not have an agp bus.
    > I also have IOMMUs disabled in the BIOS by default.

    The kernel will allocate an aperture if there isn't one (on top of
    regular memory when needed).

    > The BIOS is not enabling the GART at all, so it must be done by the
    > kernel. A boot into DOS will show the Gart Aperture Control Register set

    Correct.

    > to all zeros, where a boot to Linux 2.4 w/AGP will boot with them enabled.
    > Again, the failure mode recognised so far is that the "gart errors" appear
    > when this register is set up.
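
To compare the register contents seen under DOS and under Linux, it can help to decode the raw value. The sketch below is only an illustration: the bit layout (enable in bit 0, size order in bits 3:1, CPU and I/O access disables in bits 4 and 5, register at northbridge function 3 offset 0x90) is an assumption taken from the commonly documented K8 layout and should be checked against the AMD documentation for the exact part. The function and constant names are hypothetical.

```python
# Assumed layout of the K8 GART Aperture Control Register
# (northbridge PCI function 3, offset 0x90). Verify against AMD docs.
GART_EN = 1 << 0        # aperture enabled
GART_SIZE_SHIFT = 1     # bits 3:1 hold the size order
GART_SIZE_MASK = 0x7
DIS_GART_CPU = 1 << 4   # CPU accesses to the aperture disabled
DIS_GART_IO = 1 << 5    # I/O accesses to the aperture disabled


def decode_aperture_ctl(value):
    """Split a raw aperture control value into its assumed fields."""
    order = (value >> GART_SIZE_SHIFT) & GART_SIZE_MASK
    return {
        "enabled": bool(value & GART_EN),
        "size_mb": 32 << order,  # 32 MB scaled by the size order
        "cpu_access_disabled": bool(value & DIS_GART_CPU),
        "io_access_disabled": bool(value & DIS_GART_IO),
    }


if __name__ == "__main__":
    # An all-zero register, as reported after a boot into DOS.
    print(decode_aperture_ctl(0x00000000))
    # A hypothetical value with the aperture enabled.
    print(decode_aperture_ctl(0x00000003))
```

An all-zero value decodes as disabled, which matches the DOS observation above; any value with bit 0 set means something (here, the kernel) has turned the aperture on.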
    >
    > What the user sees at this point is even though they have the
    > "GART error reporting enable" disabled, they still see "GART" errors.

    The GART error MCE does not work properly on K8. Normally the BIOS
    disables it, but some early kernels still managed to enable
    it through a backdoor.
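
As a rough illustration of what disabling this MCE amounts to, the sketch below clears one enable bit in a machine-check control mask. It assumes that the northbridge is machine-check bank 4 on K8 and that GART table walk error reporting is bit 10 of that bank's control register; both are assumptions to verify against the AMD documentation and the kernel source. The helper names are hypothetical, and the code only manipulates integers; it does not touch any MSR.

```python
# Assumptions (verify before relying on them):
#   - the K8 northbridge reports through machine-check bank 4
#   - bit 10 of that bank's control mask enables GART table walk errors
NB_BANK = 4
GART_TBL_WALK_BIT = 10


def gart_reporting_enabled(mc_ctl):
    """True if the assumed GART table walk enable bit is set."""
    return bool(mc_ctl & (1 << GART_TBL_WALK_BIT))


def mask_gart_reporting(mc_ctl):
    """Return the control mask with the assumed GART bit cleared."""
    return mc_ctl & ~(1 << GART_TBL_WALK_BIT)


if __name__ == "__main__":
    ctl = 0xFFFFFFFF  # hypothetical "enable everything" mask
    fixed = mask_gart_reporting(ctl)
    print(hex(fixed), gart_reporting_enabled(fixed))
```

If a kernel writes an all-ones mask to every bank, it re-enables the bit that the BIOS cleared, which is one way the reporting could come back despite the BIOS setting.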

    You can rule that out by using a recent 2.4 kernel.org kernel or
    the SLES8-SP3 kernel if you want a distribution kernel (no idea
    if RH has the fix or not).

    But it's possible that it's really a different MCE.

    >
    > If you are suggesting that there may be a real hardware error here that is
    > being misinterpreted by the kernel, my next course of action is to collect
    > that real error syndrome and decode it.

    Yes, that's a good idea.
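
A minimal sketch of such a decode, given a raw 64-bit machine-check status value, might look like the following. The flag positions (valid, overflow, uncorrected, enabled, address valid, processor context corrupt) follow the architectural x86 machine-check status layout. The extended error code field (bits 19:16) and the table of names are assumptions based on the K8 northbridge bank and should be checked against the AMD documentation; the function name is hypothetical, and this is not the patch mentioned in this thread.

```python
# Assumed K8 northbridge extended error codes (bits 19:16 of the status).
NB_EXT_ERRORS = [
    "ECC error", "CRC error", "Sync error", "Master abort",
    "Target abort", "GART error", "RMW error", "Watchdog timer error",
    "Chipkill ECC error",
]


def decode_mc_status(status):
    """Decode flag bits and the assumed extended error code."""
    ext = (status >> 16) & 0xF
    name = NB_EXT_ERRORS[ext] if ext < len(NB_EXT_ERRORS) else "unknown"
    return {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),
        "uncorrected": bool((status >> 61) & 1),
        "enabled": bool((status >> 60) & 1),
        "addr_valid": bool((status >> 58) & 1),
        "context_corrupt": bool((status >> 57) & 1),
        "error_code": status & 0xFFFF,  # low 16 bits: MCA error code
        "ext_code": ext,
        "ext_name": name,
    }


if __name__ == "__main__":
    # Hypothetical status: valid, enabled, extended code 5.
    sample = (1 << 63) | (1 << 60) | (5 << 16)
    print(decode_mc_status(sample))
```

Comparing the decoded extended code against the logged message is one way to tell whether a reported "GART error" really carries the GART code or is a different MCE being mislabelled.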

    >
    > I can volunteer to assist with fixing this decoding function as well,
    > since I have a good test case here.

    We already have a patch for that, it just needs a bit more work
    before it can be merged.

    -Andi

