Re: [patch 2/2] x86 amd fix cmpxchg read acquire barrier
- From: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx>
- Date: Thu, 23 Apr 2009 18:17:11 -0400
* Arkadiusz Miskiewicz (a.miskiewicz@xxxxxxxxx) wrote:
On Thursday 23 of April 2009, Mathieu Desnoyers wrote:
* Ingo Molnar (mingo@xxxxxxx) wrote:
* Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx> wrote:
// Opteron Rev E has a bug in which on very rare occasions a locked
// instruction doesn't act as a read-acquire barrier if followed by a
// non-locked read-modify-write instruction. Rev F has this bug in
// pre-release versions, but not in versions released to customers,
// so we test only for Rev E, which is family 15, model 32..63
// inclusive.
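For illustration, the test described in the quoted comment could be sketched as below. This is only a sketch, not the code from the patch: the helper name is_opteron_rev_e() and the constant names are hypothetical, and the bounds are taken straight from the comment (family 15, model 32..63 inclusive). How the family and model values are obtained (for instance from CPUID) is left out.

```c
#include <stdbool.h>

/* Bounds from the quoted comment: family 15, model 32..63 inclusive. */
#define REV_E_FAMILY       15u
#define REV_E_MODEL_FIRST  32u
#define REV_E_MODEL_LAST   63u

/*
 * Hypothetical helper (not from the patch): returns true when the given
 * family/model pair falls inside the Rev E range named in the comment.
 */
static bool is_opteron_rev_e(unsigned int family, unsigned int model)
{
	return family == REV_E_FAMILY &&
	       model >= REV_E_MODEL_FIRST &&
	       model <= REV_E_MODEL_LAST;
}
```

Both ends of the model range are included, so models 32 and 63 match while 31 and 64 do not.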
Dunno. The fix looks a bit intrusive (emits a NOP even on good
CPUs). Also, the text above says "not in versions released to
customers".
So unless there's an official erratum or reports in the field (not
from early prototype systems shipped to developers) i'd not rush to
apply it, just yet.
Actually, Opteron Rev E has this bug in the field (family 15, model
32..63). Rev F only had the bug in pre-releases.
But yes, it's bad that it drags so many code additions to something as
critical as cmpxchg. I start to think it might be better to just
disallow bringing up more than one CPU on these machines.
That probably would be even worse than what we have now. This bug doesn't
manifest too often in a noticeable way here (I have a few such machines here,
mostly 2 x dual core; once per few months mysql dies) and losing 3 of 4 cores
(or 1 cpu of 2; depends on what you mean) doesn't sound like fun.
Having silent data corruption does not sound like fun either. Another
alternative, when we detect those CPUs, is to printk a warning saying:
"AMD Opteron family X model Y is known to corrupt data on SMP due"
"to incorrect cmpxchg instruction memory barriers. Please contact"
"AMD for more information."
And activate the "tainted" kernel flag. This way, we won't be bothered
trying to fix AMD bugs, and it will officially become AMD's problem.
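A rough userspace sketch of that warn-and-taint idea, just to make the proposal concrete. Everything named here is an assumption for illustration: cpu_is_affected(), warn_and_taint() and the kernel_tainted variable are hypothetical stand-ins, and the affected range is the one from the comment quoted earlier (family 15, model 32..63). In the kernel the message would go through printk and the flag through the taint mechanism; which taint flag to use is not decided in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the kernel's taint state (hypothetical, userspace only). */
static bool kernel_tainted = false;

/* Affected range as given in the quoted comment: family 15, model 32..63. */
static bool cpu_is_affected(unsigned int family, unsigned int model)
{
	return family == 15u && model >= 32u && model <= 63u;
}

/*
 * Hypothetical helper: on an affected CPU, format the proposed warning
 * into buf and set the taint stand-in. Returns the snprintf() result on
 * affected CPUs and 0 otherwise (buf is then left as an empty string).
 */
static int warn_and_taint(unsigned int family, unsigned int model,
			  char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	if (!cpu_is_affected(family, model))
		return 0;

	kernel_tainted = true;
	return snprintf(buf, len,
			"AMD Opteron family %u model %u is known to corrupt data on SMP due "
			"to incorrect cmpxchg instruction memory barriers. Please contact "
			"AMD for more information.\n",
			family, model);
}
```

The X and Y placeholders from the message text are filled in from the detected family and model; unaffected CPUs get no message and no taint.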
Mathieu
--
Arkadiusz Miśkiewicz PLD/Linux Team
arekm / maven.pl http://ftp.pld-linux.org/
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68