Re: [patch 1/2] i386: use asm() like the other atomic operations already do.



* Satyam Sharma | 2007-08-15 18:32:10 [+0530]:

On Wed, 15 Aug 2007, Herbert Xu wrote:

Andi Kleen <ak@xxxxxxx> wrote:
My config with march=pentium-m and gcc (GCC) 4.1.2 (Gentoo 4.1.2):
   text    data     bss     dec     hex filename
3434150  249176  176128 3859454  3ae3fe atomic_normal/vmlinux
3435308  249176  176128 3860612  3ae884 atomic_inlineasm/vmlinux
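
For reference, the growth between the two builds can be read straight off the text column. A minimal C sketch of that arithmetic follows; the helper names text_growth, percent_growth and report_text_growth are made up for illustration, and the numbers are the ones from the size output above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: byte difference between two text-section sizes. */
static long text_growth(long text_before, long text_after)
{
    return text_after - text_before;
}

/* Hypothetical helper: growth as a percentage of the original size. */
static double percent_growth(long text_before, long text_after)
{
    assert(text_before > 0); /* guard the division below */
    return 100.0 * (double)text_growth(text_before, text_after)
                 / (double)text_before;
}

/* Print the delta for the two vmlinux images quoted above. */
static void report_text_growth(void)
{
    const long normal = 3434150;    /* atomic_normal/vmlinux text */
    const long inlineasm = 3435308; /* atomic_inlineasm/vmlinux text */

    printf("text grew by %ld bytes (%.3f%%)\n",
           text_growth(normal, inlineasm),
           percent_growth(normal, inlineasm));
}
```

The data and bss columns are identical in both builds, so the whole difference in the dec column comes from text.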

What is the difference between atomic_normal and atomic_inlineasm?

The inline asm stops certain optimisations from occurring.

Yup, the "__volatile__" after "__asm__" shouldn't be required. The
previous definitions allowed the compiler to optimize certain atomic
ops away, and considering we haven't seen any bugs due to the past
behaviour anyway, changing it like this sounds unnecessary.

The second behavioral change that comes about here is the force-
constraining of v->counter to memory in the extended asm's constraint
sets. But that's required to give "volatility"-like behaviour, which
I suspect is the goal of this patch.
exactly.
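
To make the two points above concrete, here is a rough sketch of what a read and a set built on extended asm with a memory constraint could look like. This is not the patch itself: sketch_atomic_t, sketch_atomic_read, sketch_atomic_set and sketch_selfcheck are hypothetical names, and the non-x86 fallback is an assumption added only so the sketch builds elsewhere.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's atomic_t; illustrative only. */
typedef struct { int counter; } sketch_atomic_t;

/* Read: the "m" input constraint forces v->counter to be a memory
 * operand, and __volatile__ keeps the compiler from dropping or
 * merging the asm statement. */
static inline int sketch_atomic_read(const sketch_atomic_t *v)
{
    int ret;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("movl %1, %0"
                         : "=r" (ret)
                         : "m" (v->counter));
#else
    /* Assumed fallback for other architectures. */
    ret = *(const volatile int *)&v->counter;
#endif
    return ret;
}

/* Set: the "=m" output constraint forces a real store to v->counter. */
static inline void sketch_atomic_set(sketch_atomic_t *v, int i)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__("movl %1, %0"
                         : "=m" (v->counter)
                         : "r" (i));
#else
    /* Assumed fallback for other architectures. */
    *(volatile int *)&v->counter = i;
#endif
}

/* Small self-check: a value stored with set must come back from read. */
static void sketch_selfcheck(void)
{
    sketch_atomic_t v = { 0 };

    sketch_atomic_set(&v, 42);
    assert(sketch_atomic_read(&v) == 42);
}
```

Dropping __volatile__ from the read would allow gcc to delete the asm when its result is unused, or to combine two identical reads, which is the behaviour the previous definitions permitted.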

Sebastian, could you look at the kernel images in detail and see why,
how and what in the text expanded by 1158 bytes with these changes?
Wow, easy. "normal" is Linus' git tree. It has no volatile or anything. I
suspect the compiler "optimizes" some of the reads and sets away there,
and those remain once my patch is applied.
Anyway, I tried to find the source of the growth and it is not that easy.
If the text gets bigger, all references to the data section change, so I
can't simply diff the images. Then I searched for two "equal" .o files
which differ in size. This was no better, because gcc reorganized the
whole .o file: the functions that were at the beginning of the first
file were at a different position in the second one.
For those who have plenty of time on their hands, I posted the four
kernels [1] and my four O= folders [2].
Once I have some time, I will try to diff the specific functions within
the .o files which use the atomic_{read|set} operations.

I'm still unconvinced that we need this because nobody has
brought up any examples of kernel code that legitimately
needs this.

Me too, but I do find these more palatable than the variants using
"volatile", I admit.
Yes. The inline asm build is even smaller than the volatile ones, which
suggests gcc optimizes it better (and it does exactly what volatile
was used for).
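
For comparison, a "volatile" variant of the read could look roughly like the sketch below. The exact form used in the other patches is not shown in this mail, so the volatile cast is an assumption, and demo_atomic_t and the demo_* functions are hypothetical names.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's atomic_t; illustrative only. */
typedef struct { int counter; } demo_atomic_t;

/* Plain access: the compiler may reuse a value it already loaded. */
static inline int demo_read_plain(const demo_atomic_t *v)
{
    return v->counter;
}

/* Volatile access (assumed form): every call must perform a load. */
static inline int demo_read_volatile(const demo_atomic_t *v)
{
    return *(const volatile int *)&v->counter;
}

/* Two plain reads: gcc is free to fold these into a single load. */
static int demo_sum_plain(const demo_atomic_t *v)
{
    return demo_read_plain(v) + demo_read_plain(v);
}

/* Two volatile reads: both loads have to be emitted. */
static int demo_sum_volatile(const demo_atomic_t *v)
{
    return demo_read_volatile(v) + demo_read_volatile(v);
}

/* Both variants compute the same value when nothing else writes. */
static void demo_selfcheck(void)
{
    demo_atomic_t v = { 5 };

    assert(demo_sum_plain(&v) == demo_sum_volatile(&v));
}
```

Both functions return the same result in single-threaded code; they differ only in how many loads the compiler has to emit, which is where a size difference between builds can come from.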

[1] http://download.breakpoint.cc/kernel_builds_vmlinux.tbz2 8.3 MiB
[2] http://download.breakpoint.cc/kernel_builds.tbz2 117 MiB

Satyam

Sebastian


