Re: [rfc patch] i386: don't save eflags on task switch



Chuck Ebbert wrote:
In-Reply-To: <Pine.LNX.4.64.0611031645141.25218@xxxxxxxxxxx>

On Fri, 3 Nov 2006 16:46:25 -0800, Linus Torvalds wrote:

On Fri, 3 Nov 2006, Chuck Ebbert wrote:

There is no real need to save eflags in switch_to(). Instead,
we can keep a constant value in the thread_struct and always
restore that.

I don't really see the point. The "pushfl" isn't the expensive part, and it gives sane and expected semantics.

The "popfl" is the expensive part, and that's the thing that can't really even be removed.

Well, that wasn't the impression I got:

Date: Mon, 18 Sep 2006 12:12:51 -0400
From: Benjamin LaHaise <bcrl@xxxxxxxxx>
Subject: Re: Sysenter crash with Nested Task Bit set

...

It's the pushfl that will be slow on any OoO CPU, as it has dependencies on any previous instructions that modified the flags, which ends up bringing all of the memory ordering dependencies into play. Doing a popfl to set the flags to some known value is much less expensive.

That doesn't sound correct to me. The popf is far more expensive. There is no popfl $IMM instruction, so setting the flags can never avoid the memory read, and the CPU must make more expensive assumptions about the effects on the following instruction stream (TF, DF, all the status flags tested by conditional jumps...).

On every processor I've ever measured, popf is slower. On P4, for example, pushf is 6 cycles and popf is 54. On Opteron, it is 2 / 12 cycles; on Xeon, 7 / 91.

Zach
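
For anyone who wants to reproduce the kind of pushf / popf comparison quoted above, here is a rough user-space sketch. It is not from this thread and is not how the numbers above were obtained. Assumptions: x86-64 with GCC or Clang inline asm (so pushfq/popfq rather than the 32-bit pushfl/popfl), and the helper names read_flags, write_flags, time_pushf and time_popf are made up for illustration. __rdtsc() counts TSC ticks, which are not necessarily core cycles, so absolute results will probably not match the figures above.

```c
/* Sketch only: x86-64, GCC or Clang. All helper names are hypothetical. */
#include <assert.h>
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc() */

/*
 * Read RFLAGS with pushfq. The lea pair steps over the 128-byte x86-64
 * red zone so the push cannot overwrite compiler-owned stack data;
 * lea itself does not modify the flags.
 */
static inline uint64_t read_flags(void)
{
    uint64_t flags;
    __asm__ volatile("lea -128(%%rsp), %%rsp\n\t"
                     "pushfq\n\t"
                     "popq %0\n\t"
                     "lea 128(%%rsp), %%rsp"
                     : "=r"(flags)
                     :
                     : "memory");
    return flags;
}

/* Load RFLAGS with popfq. The value has to go through the stack,
 * since there is no immediate form of popf. */
static inline void write_flags(uint64_t flags)
{
    __asm__ volatile("lea -128(%%rsp), %%rsp\n\t"
                     "pushq %0\n\t"
                     "popfq\n\t"
                     "lea 128(%%rsp), %%rsp"
                     :
                     : "r"(flags)
                     : "memory", "cc");
}

/* Total TSC ticks spent on `iters` iterations of the pushfq path. */
static uint64_t time_pushf(unsigned iters)
{
    uint64_t sink = 0;
    assert(iters > 0);
    uint64_t start = __rdtsc();
    for (unsigned i = 0; i < iters; i++)
        sink ^= read_flags();
    uint64_t end = __rdtsc();
    (void)sink;
    return end - start;
}

/* Total TSC ticks spent on `iters` iterations of the popfq path. */
static uint64_t time_popf(unsigned iters)
{
    assert(iters > 0);
    uint64_t flags = read_flags();  /* reload the value we already have */
    uint64_t start = __rdtsc();
    for (unsigned i = 0; i < iters; i++)
        write_flags(flags);
    uint64_t end = __rdtsc();
    return end - start;
}
```

Calling time_pushf(n) and time_popf(n) from a small main() and dividing each total by n gives a rough per-iteration figure. Each loop also contains a plain pop or push plus the two lea instructions, so only the difference between the two results says anything about pushf versus popf.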