Re: [rfc][patch 3/3] x86: optimise barriers



On Tue, Oct 16, 2007 at 02:50:33AM +0200, Nick Piggin wrote:
On Mon, Oct 15, 2007 at 11:10:00AM +0200, Jarek Poplawski wrote:
...
I'm not performance-words at all, so I can't help you, sorry. But, I
understand people who care about this, and think there is a popular
conviction barriers and locked instructions are costly, so I'm
surprised there is any "problem" now with finding these gains...

It's more expensive than nothing, sure. However in real code, algorithmic
complexity, cache misses and cacheline bouncing tend to be much bigger
issues.

I can't think of a place in the kernel where smp_rmb matters _that_ much.
seqlocks maybe (timers, dcache lookup), vmscan... Obviously removing the
lfence is not going to hurt. Maybe we even gain 0.01% performance in
someone's workload.

Also, remember: if loads are already in-order, then lfence is a noop,
right? (in practice it seems to have to do a _little_ bit of work, but
it's like a dozen cycles).

You are right: considering current CPUs there could be no performance
problem at all. Removing LOCKs for older ones should probably matter
more, but as a matter of fact, now I wouldn't bet even on this - it
could be mostly noop as well.

As a matter of fact it's not natural for me at all. I expected the
other direction, and I still doubt programmers' intentions could be
"automatically" predicted good enough, so IMHO, it's not for long.

Really? Consider the consequences if, instead of releasing this latest
document tightening consistency, Intel found that out of order loads
were worth 5% more performance and implemented them in their next chip.
The chip could be completely backwards compatible, but all your old code
would break, because it was broken to begin with (because it was outside
the spec).

I've different opinion on this: I expect any spec to describe current
implementation. Before issuing new models any changes of
implementation should be made public with proper margin of time. Then
system could be optimally adjusted to a real hardware, instead of
planned only, but possibly never realized (plus doing such not used
things with old means is usually more costly: lock vs. lfence). There
is still problem of specs' completness: there are probably often some
things unspecified which could brake on a new model, so never 100%
guarantee anyway.

IMO Intel did exactly the right thing from an engineering perspective,
and so did Linux to always follow the spec.

But, if you follow the spec - you don't follow the spec! Why do you
ignore so much this part of Intel's spec:

"This document contains information which Intel may change at any
time without notice. Do not finalize a design with this information."

Maybe it's a real Intel intention and not for lawyers only? (Btw, it
seems we have an example.)

Regards,
Jarek P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • Re: [rfc][patch 3/3] x86: optimise barriers
    ... and I still doubt programmers' intentions could be ... The chip could be completely backwards compatible, ... what you don't realize is that Intel have built their business on makeing sure that their new CPU's run existing software with no modifications,. ... and so did Linux to always follow the spec. ...
    (Linux-Kernel)
  • Re: Intel Inside no more
    ... >compiler tricks and architectures geared towards achieving high SPEC ... >server chips competed with each other. ... When compared to AMD the clock speed difference wasn't as large, ... However they did so a 7 months after Intel had hit ...
    (comp.sys.ibm.pc.hardware.chips)
  • Re: Please Help - Intel D975XBX Conroe X6800
    ... The 1073 BIOS does not make sense there is no such thing for the Intel ... I am building a new High spec PC and cannot get it to boot. ... I have disconnected everything but the Graphics card, memory and CPU ...
    (alt.comp.hardware.pc-homebuilt)
  • Re: Intel Inside no more
    ... > compiler tricks and architectures geared towards achieving high SPEC ... the point is that as a result, AMD got shanked and had big problems ... Intel had better ASPs and margins on the mobile segment than ... >> Now today, AMD has made inroads into the server market, but you really ...
    (comp.sys.ibm.pc.hardware.chips)
  • Re: Please Help - Intel D975XBX Conroe X6800
    ... not boot in an Intel D975XBX motherboard with any Intel ... I am building a new High spec PC and cannot get it to boot. ... Intel Conroe Core 2 Extreme X6800 CPU ... If I remove the memory altogether I get the memory failed beeps. ...
    (alt.comp.hardware.pc-homebuilt)

Loading