MMIO and gcc re-ordering (Was: [PATCH] [POWERPC] Improve (in|out)_beXX() asm code)




IE. Take an x86 version of that test, writing to memory, doing a writel
to some MMIO, then another memory write, can those be re-ordered with
the current x86 version of writel ?

Yes, the same thing can happen on x86. As far as I could tell, this is
something that all other arches can have happen. Usually aliasing prevents
it, but it's not hard to constuct a test case where it doesn't.

That brings us back to the old debate...

For consistent memory, should we mandate a wmb between write to some dma
data structure in consistent memory and the following mmio write that
trigger it, and the same goes with rmb and read ?

David, you remember we had those discussions a while back when I was
trying to relax a bit the barriers we have in our MMIO accessors on
powerpc, and the overwhelming answer was that x86 being in order, I have
to expect 90% of the drivers to not get any barrier right on any
platform, and thus I should make my MMIO accessors "look like" x86 and
thus ordered.

We did that, adding some barriers in the assembly of our readl/writel.

However, I didn't change the clobber, it's still *addr, not a full
"memory" clobber, just like x86.

Now if it appears that gcc can indeed re-order things, then we have a
problem on both x86, ppc, and pretty much everybody else. (I'm not sure
about sparc but I don't see any explicit clobber in your accessors
there).

So that brings the whole subject back imho. What should be the approach
here ? I see several options:

- mandate some kind of dma_sync_for_device/cpu on consistent memory.
Almost no driver do that currently tho. They only do that for non
consistent memory mapped with dma_map_*.

- mandate the use of wmb,rmb,mb barriers for use between memory
accesses and MMIOs for ordering them. (ie. fix drivers that don't do
it). Advantage for powerpc is that I can remove (after some auditing of
course) the added heavy barriers in the MMIO accessors themselves.

- stick a full memory clobber in all MMIO (and PIO) accessors on all
archs.

Any other idea ? preference ?

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • Re: 16/32 processor operating mode
    ... Okay, x86 it is. ... hardware perspective, as I already mentioned, most memory accesses are ... called a "cache line") in a single operation. ... much of the data bus is active when accessing stuff on the bus. ...
    (alt.lang.asm)
  • Re: [RFC] MMIO accessors & barriers documentation
    ... between a memory store and an MMIO store. ... with write accessors of Class 2 and 3. ...
    (Linux-Kernel)
  • Re: [RFC] MMIO accessors & barriers documentation
    ... between descriptor updates in memory and the mmio to kick them? ... followed by MMIO store, that is does ia64 -current- accessors provide ... the barriers allowing to implement those ordering rules when the ...
    (Linux-Kernel)
  • Re: Zones in Linux
    ... called as NORMAL, DMA, HIGH memory zones. ... In that author specified that x86 won't be able to access above 868MB. ... All 32 bit x86 processors are able to access at least 4 GB of physical memory. ... With PAE, x86 processors are able to address 64GB of physical memory, although in all cases, 4GB is the virtual address range. ...
    (comp.os.linux.development.system)
  • Re: [RFC] MMIO accessors & barriers documentation
    ... intended to provide memory vs. I/O ordering, but isn't PPC the only platform ... case where ordering MMIO W + memory W is useful and I can't see any ... Those are the ones biting us on PowerPC (we haven't seen the lock ... problem but then it can't happen the way our current accessors are ...
    (Linux-Kernel)