Re: [patch 2/2] x86 amd fix cmpxchg read acquire barrier



* Arkadiusz Miskiewicz (a.miskiewicz@xxxxxxxxx) wrote:
On Thursday 23 of April 2009, Mathieu Desnoyers wrote:
* Ingo Molnar (mingo@xxxxxxx) wrote:
* Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx> wrote:
" // Opteron Rev E has a bug in which on very rare occasions a locked
// instruction doesn't act as a read-acquire barrier if followed by a
// non-locked read-modify-write instruction. Rev F has this bug in
// pre-release versions, but not in versions released to customers,
// so we test only for Rev E, which is family 15, model 32..63
// inclusive.
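
For readers who want to see what the "family 15, model 32..63" test amounts to, here is a minimal userspace sketch. It assumes the standard CPUID leaf 1 EAX layout (base family in bits 11:8, base model in 7:4, extended model in 19:16, extended family in 27:20). The function names are hypothetical and are not taken from the patch, and a real check would also have to confirm that the vendor is AMD.

```c
#include <stdbool.h>
#include <stdint.h>

/* Display family from a CPUID leaf 1 EAX signature.
 * The extended family is only added when the base family is 0xf. */
static unsigned int cpu_family(uint32_t sig)
{
    unsigned int base = (sig >> 8) & 0xf;
    unsigned int ext = (sig >> 20) & 0xff;

    return base == 0xf ? base + ext : base;
}

/* Display model from the same signature.
 * On AMD the extended model is only used when the base family is 0xf. */
static unsigned int cpu_model(uint32_t sig)
{
    unsigned int base_family = (sig >> 8) & 0xf;
    unsigned int base = (sig >> 4) & 0xf;
    unsigned int ext = (sig >> 16) & 0xf;

    return base_family == 0xf ? (ext << 4) | base : base;
}

/* Hypothetical helper: Rev E as described in the comment above,
 * i.e. family 15, model 32..63 inclusive. Vendor check not included. */
static bool is_opteron_rev_e(unsigned int family, unsigned int model)
{
    return family == 15 && model >= 32 && model <= 63;
}
```

A caller would read the signature with CPUID leaf 1 and pass `cpu_family(sig)` and `cpu_model(sig)` to `is_opteron_rev_e()`.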

Dunno. The fix looks a bit intrusive (emits a NOP even on good
CPUs). Also, the text above says "not in versions released to
customers".

So unless there's an official erratum or reports in the field (not
from early prototype systems shipped to developers) i'd not rush to
apply it, just yet.

Actually, Opteron Rev E has this bug in the field (family 15, model
32..63). Rev F only had the bug in pre-releases.

But yes, it's bad that it drags so many code additions to something as
critical as cmpxchg. I start to think it might be better to just
disallow bringing up more than one CPU on these machines.

That would probably be even worse than what we have now. This bug doesn't
manifest too often in a noticeable way here (I have a few such machines here,
mostly 2 x dual core; once every few months mysql dies) and losing 3 of 4 cores
(or 1 cpu of 2; depends on what you mean) doesn't sound like fun.


Having silent data corruption does not sound like fun either. Another
alternative, when we detect those CPUs, is to printk a warning saying:

"AMD Opteron family X model Y is known to corrupt data on SMP due "
"to incorrect cmpxchg instruction memory barriers. Please contact "
"AMD for more information."

And activate the "tainted" kernel flag. This way, we won't be bothered
trying to fix AMD bugs, and it will officially become AMD's problem.
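
A rough userspace sketch of that idea, for illustration only: in the kernel this would presumably go through printk and the existing taint mechanism, but here `snprintf` and a plain boolean stand in for both. The names `warn_broken_cmpxchg` and `kernel_tainted` are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's "tainted" state. */
static bool kernel_tainted;

/* Format the proposed warning into buf and mark the (simulated) kernel
 * as tainted. Returns the snprintf() result, i.e. the number of
 * characters the full message needs, excluding the trailing NUL. */
static int warn_broken_cmpxchg(char *buf, size_t len,
                               unsigned int family, unsigned int model)
{
    kernel_tainted = true;

    /* Adjacent string literals are concatenated, so each line keeps
     * its own trailing space. */
    return snprintf(buf, len,
                    "AMD Opteron family %u model %u is known to corrupt "
                    "data on SMP due to incorrect cmpxchg instruction "
                    "memory barriers. Please contact AMD for more "
                    "information.\n",
                    family, model);
}
```

This would be called once, at CPU detection time, only when an affected family/model is found.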

Mathieu


--
Arkadiusz Miśkiewicz PLD/Linux Team
arekm / maven.pl http://ftp.pld-linux.org/



--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68