Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures





On Wed, 15 Aug 2007, Segher Boessenkool wrote:

"Volatile behaviour" itself isn't consistently defined (at least
definitely not consistently implemented in various gcc versions across
platforms),

It should be consistent across platforms; if not, file a bug please.

but it is /expected/ to mean something like: "ensure that
every such access actually goes all the way to memory, and is not
re-ordered w.r.t. other accesses, as far as the compiler can take
^
(volatile)

(or, alternatively, "other accesses to the same volatile object" ...)

care of these". The last "as far as compiler can take care" disclaimer
comes about due to CPUs doing their own re-ordering nowadays.

You can *expect* whatever you want, but this isn't in line with
reality at all.

volatile _does not_ prevent reordering wrt other accesses.
[...]
What volatile does is a) never optimise away a read (or write)
to the object, since the data can change in ways the compiler
cannot see; and b) never move stores to the object across a
sequence point. This does not mean other accesses cannot be
reordered wrt the volatile access.

If the abstract machine would do an access to a volatile-
qualified object, the generated machine code will do that
access too. But, for example, it can still be optimised
away by the compiler, if it can prove it is allowed to.

As (now) indicated above, I had meant multiple volatile accesses to
the same object, obviously.

BTW:

#define atomic_read(a) (*(volatile int *)&(a))
#define atomic_set(a,i) (*(volatile int *)&(a) = (i))

int a;

void func(void)
{
	int b;

	b = atomic_read(a);
	atomic_set(a, 20);
	b = atomic_read(a);
}

gives:

func:
	pushl	%ebp
	movl	a, %eax
	movl	%esp, %ebp
	movl	$20, a
	movl	a, %eax
	popl	%ebp
	ret

so the first atomic_read() wasn't optimized away.
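
For anyone who wants to reproduce this, a self-contained variant of the
test might look like the sketch below. The macros are the ones from above;
read_set_read() and self_test() are names made up for this illustration.
Compiling it with something like "gcc -O2 -S" should show both loads of
"a" in the output, though the exact instructions will differ between
compiler versions and targets.

```c
#include <assert.h>

/* Same casts as in the test above. */
#define atomic_read(a)   (*(volatile int *)&(a))
#define atomic_set(a, i) (*(volatile int *)&(a) = (i))

static int a;

/* Hypothetical helper: returns the value seen by the second read. */
int read_set_read(void)
{
	int b;

	b = atomic_read(a);	/* result is dead, but the load is volatile */
	atomic_set(a, 20);	/* volatile store */
	b = atomic_read(a);	/* second volatile load */
	return b;
}

/* Hypothetical sanity check: the second read sees the stored value. */
void self_test(void)
{
	assert(read_set_read() == 20);
}
```

This only checks the value at run time; whether the first load survives
has to be verified by looking at the generated assembly.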


volatile _does not_ make accesses go all the way to memory.
[...]
If you want stuff to go all the way to memory, you need some
architecture-specific flush sequence; to make a store globally
visible before another store, you need mb(); before some following
read, you need mb(); to prevent reordering you need a barrier.

Sure, which explains the "as far as the compiler can take care" bit.
Poor phrase / choice of words, probably.
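
To make the distinction concrete, here is a rough userspace sketch of the
"store, mb(), store" pattern described above. It is an illustration only:
mb() is defined here as GCC's __sync_synchronize() builtin as a stand-in
for the kernel's architecture-specific mb(), and publish(), consume() and
self_test() are hypothetical names, not anything from the kernel.

```c
#include <assert.h>

/* Assumption: userspace stand-in for the kernel's mb(), using a GCC builtin. */
#define mb() __sync_synchronize()

static int data;
static int flag;

/* Hypothetical producer: make "data" visible before "flag". */
void publish(int value)
{
	data = value;
	mb();			/* order the data store before the flag store */
	flag = 1;
}

/* Hypothetical consumer: returns 1 and fills *out once flag is set. */
int consume(int *out)
{
	if (!*(volatile int *)&flag)	/* volatile: the load is really emitted */
		return 0;
	mb();			/* order the flag load before the data load */
	*out = data;
	return 1;
}

/* Hypothetical single-threaded sanity check of the helpers. */
void self_test(void)
{
	int v = 0;

	publish(42);
	assert(consume(&v) == 1 && v == 42);
}
```

The volatile cast only forces the compiler to emit the load of "flag";
the ordering between the two variables comes from mb(), not from volatile.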


