Re: [PATCH 2/2] cciss: disable dma prefetch for P600



On Fri, 20 Oct 2006 14:56:18 -0500
"Mike Miller (OS Dev)" <mikem@xxxxxxxxxxxxxxxxxxxxxxx> wrote:

..


What is the actual problem which is being fixed here?

Sorry, I should have been clearer. There is a bug in the DMA engine that
may result in prefetching data from beyond the end of memory or
falling off into one of the holes on IPF and AMD. It causes a machine check
when that happens.
It doesn't happen on Proliant because the last 4kB (or so) of memory is
mapped out by the BIOS and Pentium guarantees contiguous memory.

I think that this:

#if defined(CONFIG_IA64) || defined(CONFIG_X86_64)

is nowhere near strong enough and is probably inappropriate.

It _could_ be that CONFIG_DISCONTIGMEM|CONFIG_SPARSEMEM will be closer, but
even CONFIG_FLATMEM systems can have holes.

I'm poking around on some IPF platforms. It looks like CONFIG_DISCONTIGMEM is
set on them, but not the others you mention. Would that be sufficient?

I don't think so. All machines in all memory models can and do have holes
in their memory map. I think the problem is that some machines object to
having those holes read from and others do not. It could be that this
problem is purely an ia64 thing.

And it's not just holes: we had a problem a year or so back where CPU
prefetching was walking off the end of real memory and into the AGP region
and was causing weird cache coherency problems on x86_64 (or something like
that).


On what machines can/does this card exist? Things like powerpc?

This problem was found on Itanium. We don't try to support powerpc.

Well the CCISS driver presently has no architecture Kconfig dependencies,
so anyone can build it on anything. I don't know whether it's physically
possible to put a cciss controller into a power/sparc/whatever machine -
are these controllers only ever integrated onto the main board?

Anyway, I'd suggest the best way of sorting this out is to come up with a
complete description of the problem, decide which architectures are
affected and to then ask the relevant architecture maintainers to recommend
a solution.

I think the description would be

There is a bug in the DMA engine that may result in prefetching
data from beyond the end of memory or falling off into one of the holes on
IPF and AMD. It causes a machine check when that happens.

It doesn't happen on Proliant because the last 4kB (or so) of memory is
mapped out by the BIOS and Pentium guarantees contiguous memory.

If the platform is vulnerable to this then the driver's prefetching needs
to be disabled at compile-time or, preferably, initialization-time. What
is the best means by which we can determine whether the platform needs
this treatment?
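
To make the initialization-time option concrete, here is a minimal userspace
sketch, not driver code. Everything named sketch_* or SKETCH_* is hypothetical:
the bit position is a placeholder rather than a value from hardware
documentation, and the prefetch_unsafe flag stands in for whatever answer the
architecture maintainers recommend. It only shows the shape of deciding at
init time instead of with an #if on the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit: assumed to turn DMA prefetch off when set. */
#define SKETCH_DMA_PREFETCH_DISABLE (1u << 15)

/* Hypothetical per-platform answer, e.g. from arch code or a quirk table. */
struct sketch_platform {
	bool prefetch_unsafe;	/* reads past RAM or into holes machine-check */
};

/* Return the DMA config value the init path would write back. */
static uint32_t sketch_dma_cfg(const struct sketch_platform *p, uint32_t cfg)
{
	if (p->prefetch_unsafe) {
		cfg |= SKETCH_DMA_PREFETCH_DISABLE;
		/* Sanity check: the disable bit must now be set. */
		assert(cfg & SKETCH_DMA_PREFETCH_DISABLE);
	}
	return cfg;
}
```

The point of this shape is that the driver stays buildable everywhere and the
platform-specific knowledge lives in one predicate, which is the part that
still needs an answer.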


(the patch didn't compile, btw: there's no definition of I2O0_DMA1_CFG)


