Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- From: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx>
- Date: Tue, 17 Aug 2010 11:55:21 -0400
* Steven Rostedt (rostedt@xxxxxxxxxxx) wrote:
On Tue, 2010-08-17 at 10:16 -0400, Mathieu Desnoyers wrote:
* Steven Rostedt (rostedt@xxxxxxxxxxx) wrote:
If we are this concerned, what about just doing:
--t->rcu_read_lock_nesting;
if (ACCESS_ONCE(t->rcu_read_lock_nesting) == 0 &&
unlikely((ACCESS_ONCE(t->rcu_read_unlock_special)))
I'd be concerned by the fact that there is no strong ordering guarantee
that the non-volatile --t->rcu_read_lock_nesting is done before
ACCESS_ONCE(t->rcu_read_unlock_special).
My concern is that the compiler might be allowed to turn your code into:
if (ACCESS_ONCE(t->rcu_read_lock_nesting) == 1 &&
unlikely((ACCESS_ONCE(t->rcu_read_unlock_special))) {
--t->rcu_read_lock_nesting;
do_something();
} else
--t->rcu_read_lock_nesting;
That just seems to break all sorts of rules.
Is there a rule stating that a non-volatile access is ensured to be in
issued in program order wrt other accesses to that same variable ?
The stardard discourages compilers to do any kind of optimization when
volatile is detected on a variable
(http://gcc.gnu.org/onlinedocs/gcc/Volatiles.html), but that's a very
weak guarantee I would not like to rely on.
The only strong guarantee it provides is:
"The minimum either standard specifies is that at a sequence point all
previous accesses to volatile objects have stabilized and no subsequent
accesses have occurred."
So the question here is: given that the "--t->rcu_read_lock_nesting"
access is not marked volatile, but the following
"ACCESS_ONCE(t->rcu_read_lock_nesting)" read is volatile, should we
consider than "--t->rcu_read_lock_nesting" apply to a volatile object or
not ?
This is the kind of grey zone I dislike.
So whether or not this could be done by the compiler depending on the
various definitions of volatile, I strongly recommend against using
volatile accesses to provide compiler ordering guarantees. It is bad in
terms of code documentation (we don't document _what_ is ordered) and it
is also bad because the volatile ordering guarantees seems to be
very easy to misinterpret.
Yes, volatile does not guarantee ordering of other accesses, but it
should at least guarantee ordering of access to the thing that is
volatile.
b++;
a++;
c = ACCESS_ONCE(a);
'b++' can be moved to anywhere. But I'm pretty sure the compiler is not
allowed to move the 'a++' after the ACCESS_ONCE(a) because it is the
thing that is volatile.
AFAIU, the standard only requires ordering between volatile "objects".
So when we cast non-volatile objects to volatile, I assume the
non-volatile accesses apply only to the non-volatile version of the
object. So volatile ordering guarantees would not apply to "a".
We are telling the compiler that 'a' can change
outside our scope, which to me is the same as doing:
a++;
c = some_global_function(&a);
Where, the compiler does not know the result of 'a' and can not move the
'a++'.
The code here is different: calling an external function is equivalent
to put a full compiler memory barrier ("memory" clobber). Volatile is
quite different from that; the compiler is only told to keep ordering
between volatile accesses, and to try not to optimize the volatile
access per se.
Maybe I'm wrong, and need to verify this with a compiler expert. But
what's the use of volatile if it can't protect the ordering of what is
volatile from itself.
I'm concerned about the fact that volatile seems to have been initially
designed to apply to a variable declaration (object) rather than a
specific access through a cast. So I really wonder if the non-volatile
object accesses has the same guarantees as the accesses performed on its
volatile cast version.
ACCESS_ONCE() should be only that: a macro that tells the access should
be performed only once. Why are we suddenly presuming it should have any
ordering semantic ?
Only ordering with the variable that is volatile. It has no ordering to
any other variable.
This is not quite correct. Volatile orders with respect to _all_ other
volatile accesses.
What I am pointing out here is that the macro name "ACCESS_ONCE()" does
not convey any ordering semantic, and should not.
It should be totally valid to create arch-specific ACCESS_ONCE() macros
that only perform the "read once", without the ordering guarantees
provided by the current ACCESS_ONCE() "volatile" implementation. The
following code is only for unsigned long, but you get the idea: there is
no volatile at all, and I ensure that "val" is only read once by using
the "+m" (val) constraint, telling the compiler (falsely) that the
assembler is modifying the value (it therefore has a side-effect), so
gcc won't be tempted to re-issue the assembly statement.
static inline unsigned long arch_access_once(unsigned long val)
{
unsigned long ret;
#if (__BITS_PER_LONG == 32)
asm ("movl %1,%0": "=r" (ret), "+m" (val));
#else
asm ("movq %1,%0": "=r" (ret), "+m" (val));
#endif
}
Heck, this is too much micro optimization. We could just be safe and do
the:
--t->rcu_read_lock_nesting;
barrier();
if (ACCESS_ONCE(t->rcu_read_lock_nesting) == 0 &&
unlikely((ACCESS_ONCE(t->rcu_read_unlock_special)))
And be done with it.
Then we could go for the simpler:
--t->rcu_read_lock_nesting;
barrier();
if (t->rcu_read_lock_nesting == 0 &&
unlikely((t->rcu_read_unlock_special))
Which puts a constraint across all memory accesses. I'd be fine with
that if you are afraid of too much micro-optimization (e.g. my
barrier2(a, b) proposal).
Thanks,
Mathieu
-- Steve
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
- Follow-Ups:
- Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- From: Steven Rostedt
- Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- References:
- Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- From: Mathieu Desnoyers
- Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- From: Paul E. McKenney
- Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- From: Mathieu Desnoyers
- Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- From: Paul E. McKenney
- Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- From: Mathieu Desnoyers
- Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- From: Paul E. McKenney
- Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- From: Mathieu Desnoyers
- Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- From: Steven Rostedt
- Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- From: Mathieu Desnoyers
- Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- From: Steven Rostedt
- Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- Prev by Date: Re: [RFC] - Mapping ACPI tables as CACHED
- Next by Date: Re: bug LINUX_VERSION_CODE undefined
- Previous by thread: Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- Next by thread: Re: [PATCH tip/core/rcu 08/10] rcu: Add a TINY_PREEMPT_RCU
- Index(es):
Relevant Pages
|