Re: [PATCH] disable CPU side GART accesses



Ingo Molnar wrote:
(Cc:-ed the GART folks.)

* Bob Montgomery <bob.montgomery@xxxxxx> wrote:

This patch prevents improper access of the GART aperture from kdump
kernels running on AMD systems.

Symptoms of the problem include hangs, spurious restarts, and MCE
(Machine Check Exception) panics in some AMD Opteron systems that
enable the GART IOMMU and access /proc/vmcore or /dev/oldmem from a
kdump kernel. Note that the GART IOMMU will not be enabled on systems
with less than 4 GB of RAM, so symptoms will not appear. This problem
has been reproduced on Family 10H Quad-Core AMD Opteron systems.

This patch changes the initialization of the GART to set the
DISGARTCPU bit in the GART Aperture Control Register
(AMD64_GARTAPERTURECTL). Setting the bit prevents requests from the
CPUs from accessing the GART. In other words, CPU memory accesses to
the aperture address range will not cause the GART to perform an
address translation. The aperture area is currently being unmapped at
the kernel level with set_memory_np() in gart_iommu_init to prevent
accesses from the CPU, but that kernel level unmapping is not in
effect in the kexec'd kdump kernel. By disabling the CPU-side
accesses within the GART, which does persist through the kexec of the
kdump kernel, the kdump kernel is prevented from interacting with the
GART during accesses to the dump memory areas which include the
address range of the GART aperture. Although the patch can be applied
to the kdump kernel, it is not exercised there because the kdump
kernel doesn't attempt to initialize the GART, since it typically runs
in less than 4 GB of memory.
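
For readers following along, the register change described above amounts to setting one more bit in the value written when the GART is enabled. Below is a minimal user-space sketch of that bit manipulation, not the patch itself. The bit positions (GartEn at bit 0, DisGartCpu at bit 4, DisGartIo at bit 5) are assumed from AMD's register documentation, and the helper name gart_ctl_value() is hypothetical.

```c
#include <stdint.h>

/* Assumed bit layout of the GART Aperture Control Register. */
#define GARTEN      (1u << 0)   /* enable GART translation */
#define DISGARTCPU  (1u << 4)   /* CPU accesses are not translated by the GART */
#define DISGARTIO   (1u << 5)   /* I/O accesses are not translated by the GART */

/*
 * Hypothetical helper: given the current register value, return the value
 * to write back so that devices still use the GART but CPUs do not.
 * The aperture size field (bits 3:1) is left untouched.
 */
static uint32_t gart_ctl_value(uint32_t ctl)
{
    ctl |= GARTEN | DISGARTCPU;   /* enable translation, block the CPU side */
    ctl &= ~DISGARTIO;            /* keep the I/O side working */
    return ctl;
}
```

In the kernel the read and write would go through PCI config space accessors on the northbridge device; that part is omitted here because the sketch only illustrates which bits change.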

What about the part of the GART aperture that is not used by the IOMMU?

	/*
	 * Unmap the IOMMU part of the GART. The alias of the page is
	 * always mapped with cache enabled and there is no full cache
	 * coherency across the GART remapping. The unmapping avoids
	 * automatic prefetches from the CPU allocating cache lines in
	 * there. All CPU accesses are done via the direct mapping to
	 * the backing memory. The GART address is only used by PCI
	 * devices.
	 */
	set_memory_np((unsigned long)__va(iommu_bus_base),
		      iommu_size >> PAGE_SHIFT);

The code only marks the IOMMU window as not present, not the whole aperture.
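
To make the gap concrete, here is a rough sketch of the arithmetic. It assumes the IOMMU window occupies the top iommu_size bytes of the aperture, which is my reading of the GART setup code and should be checked against the tree in question. The struct and helper names are hypothetical, and the 32 MB window size used in the note below is only an example value.

```c
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4 KB pages on x86-64 */

/* Hypothetical description of an address window. */
struct window {
    uint64_t base;
    uint64_t size;
};

/* Assumption: the IOMMU window is the top iommu_size bytes of the aperture. */
static struct window iommu_window(uint64_t aper_base, uint64_t aper_size,
                                  uint64_t iommu_size)
{
    struct window w = { aper_base + aper_size - iommu_size, iommu_size };
    return w;
}

/* Pages covered by set_memory_np(), i.e. iommu_size >> PAGE_SHIFT. */
static uint64_t pages_unmapped(uint64_t iommu_size)
{
    return iommu_size >> PAGE_SHIFT;
}

/* Aperture pages that stay mapped because they lie outside the window. */
static uint64_t pages_left_mapped(uint64_t aper_size, uint64_t iommu_size)
{
    return (aper_size - iommu_size) >> PAGE_SHIFT;
}
```

With a 64 MB aperture at 0x1c000000 and an example 32 MB IOMMU window, the window would start at 0x1e000000, and half of the aperture pages would remain mapped for the CPU.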

Also, the following patch should already fix the problem with kexec/kdump. It has been in mainline since 2.6.25-rc1.

YH

commit aaf230424204864e2833dcc1da23e2cb0b9f39cd
Author: Yinghai Lu <Yinghai.Lu@xxxxxxx>
Date: Wed Jan 30 13:33:09 2008 +0100

x86: disable the GART early, 64-bit

For K8 systems with 4G of RAM and memory hole remapping enabled, or with
more than 4G of RAM installed:

When kexec'ing a second kernel from a first kernel that doesn't include
gart_shutdown, the second kernel could have a different aperture position
than the first kernel, and the second kernel could use as RAM a hole that
is still used by the GART set up by the first kernel. This happens
especially when kexec'ing 2.6.24 with sparsemem enabled from a previous
kernel (from RHEL 5 or SLES 10): the new kernel will use the GART aperture
(set by the first kernel) for vmemmap, and after the new kernel sets up
a new GART, that position becomes real RAM again and the _mapcount that
was set is lost.

Bad page state in process 'swapper'
page:ffffe2000e600020 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Pid: 0, comm: swapper Not tainted 2.6.24-rc7-smp-gcdf71a10-dirty #13

Call Trace:
[<ffffffff8026401f>] bad_page+0x63/0x8d
[<ffffffff80264169>] __free_pages_ok+0x7c/0x2a5
[<ffffffff80ba75d1>] free_all_bootmem_core+0xd0/0x198
[<ffffffff80ba3a42>] numa_free_all_bootmem+0x3b/0x76
[<ffffffff80ba3461>] mem_init+0x3b/0x152
[<ffffffff80b959d3>] start_kernel+0x236/0x2c2
[<ffffffff80b9511a>] _sinittext+0x11a/0x121

and
[ffffe2000e600000-ffffe2000e7fffff] PMD ->ffff81001c200000 on node 0
phys addr is : 0x1c200000

RHEL 5.1 kernel -53 said:
PCI-DMA: aperture base @ 1c000000 size 65536 KB

new kernel said:
Mapping aperture over 65536 KB of RAM @ 3c000000

So try to disable that GART early if possible.
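
The numbers in the log above can be checked by hand. The sketch below does the same arithmetic in C. It assumes the 2.6.24-era x86-64 direct-mapping base of 0xffff810000000000, which is consistent with the physical address quoted in the log; the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed direct-mapping base for 2.6.24-era x86-64 kernels. */
#define PAGE_OFFSET 0xffff810000000000ULL

/* Convert a direct-mapping virtual address to a physical address. */
static uint64_t direct_map_to_phys(uint64_t vaddr)
{
    return vaddr - PAGE_OFFSET;
}

/* True if phys lies inside an aperture of size_kb kilobytes at base. */
static bool in_aperture(uint64_t phys, uint64_t base, uint64_t size_kb)
{
    uint64_t size = size_kb << 10;   /* KB to bytes */
    return phys >= base && phys - base < size;
}
```

The PMD target ffff81001c200000 converts to physical 0x1c200000. That address falls inside the old 65536 KB aperture at 1c000000 (0x1c000000 to 0x1fffffff) and outside the new one at 3c000000, so the vmemmap page was backed by memory the old GART was still translating.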

