Re: [sched, patch] better wake-balancing, #3

From: Ingo Molnar (mingo_at_elte.hu)
Date: 07/30/05

    Date:	Sat, 30 Jul 2005 09:19:17 +0200
    To: Nick Piggin <nickpiggin@yahoo.com.au>
    
    

    * Nick Piggin <nickpiggin@yahoo.com.au> wrote:

    > > here's an updated patch. It handles one more detail: on SCHED_SMT we
    > > should check the idleness of siblings too. Benchmark numbers still
    > > look good.
    >
    > Maybe. Ken hasn't measured the effect of wake balancing in 2.6.13,
    > which is quite a lot different to that found in 2.6.12.
    >
    > I don't really like having a hard cutoff like that - wake balancing can
    > be important for IO workloads, though I haven't measured for a long
    > time. [...]

    well, i have measured it, and it was a win for just about everything
    that is not idle, and even for an IPC (SysV semaphores) half-idle
    workload i've measured a 3% gain. No performance loss in tbench either,
    which is clearly the most sensitive to affine/passive balancing. But i'd
    like to see what Ken's (and others') numbers are.

    the hard cutoff also has the benefit that it allows us to potentially
    make wakeup migration _more_ aggressive in the future. So instead of
    having to think about weakening it due to the tradeoffs present in e.g.
    Ken's workload, we can actually make it stronger.
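
To make the cutoff concrete, here is a toy userspace model. It is not the patch and not kernel code. It assumes the cutoff means "pull the woken task to the waking CPU (this_cpu) only if that CPU and, on SMT, all of its siblings have empty runqueues; otherwise queue the task where it last ran (prev_cpu)". The names struct toy_cpu, cpu_and_siblings_idle() and toy_wake_target() are hypothetical, as is the 4-CPU layout.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4   /* hypothetical toy machine: 2 cores x 2 SMT threads */

/* Toy per-CPU state; an illustration only, NOT the kernel's runqueue. */
struct toy_cpu {
    unsigned int nr_running;    /* runnable tasks queued on this CPU */
    unsigned int sibling_mask;  /* bitmask of SMT siblings, including self */
};

/* True if 'cpu' and every SMT sibling of it have empty runqueues. */
static bool cpu_and_siblings_idle(const struct toy_cpu *cpus, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    for (int i = 0; i < NR_CPUS; i++) {
        if ((cpus[cpu].sibling_mask & (1u << i)) && cpus[i].nr_running > 0)
            return false;
    }
    return true;
}

/*
 * Pick the CPU for a task that is being woken up.
 * prev_cpu: where the task last ran; this_cpu: the CPU doing the wakeup.
 * Hard cutoff (assumed semantics): migrate only to a fully idle target.
 */
static int toy_wake_target(const struct toy_cpu *cpus, int prev_cpu, int this_cpu)
{
    if (this_cpu != prev_cpu && cpu_and_siblings_idle(cpus, this_cpu))
        return this_cpu;
    return prev_cpu;
}
```

Because the migration decision is gated by a single yes/no test, whatever heuristic runs after the test passes could be made more aggressive without affecting busy systems, which is the point made above.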

    > [...] In IPC workloads, the cache affinity of local wakeups becomes
    > less apparent when the runqueue gets lots of tasks on it, however
    > benefits of IO affinity will generally remain. Especially on NUMA
    > systems.

    especially on NUMA, if the migration-target CPU (this_cpu) is not at
    least partially idle, i'd be quite uneasy about passive balancing from
    another node. I suspect this needs numbers from Martin and John?

    > fork/clone/exec/etc balancing really doesn't do anything to capture
    > this kind of relationship between tasks and between tasks and IRQ
    > sources. Without wake balancing we basically have a completely random
    > scattering of tasks.

    Ken's workload is a heavy IO one with lots of IRQ sources. And precisely
    for this type of workload, the best tactic is usually to leave the task
    alone and queue it wherever it last ran.

    whenever there's a strong (and exclusive) relationship between tasks and
    individual interrupt sources, explicit binding to CPUs/groups of CPUs is
    the best method. In any case, more measurements are needed.
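
For reference, explicit binding from userspace can be sketched with the glibc sched_setaffinity() wrapper. This is a minimal example, assuming Linux and glibc; the helpers first_allowed_cpu() and pin_self_to_cpu() are hypothetical names, and choosing which CPU matches a given interrupt source is left to the administrator.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/* Return the lowest-numbered CPU the caller may run on, or -1 on error. */
static int first_allowed_cpu(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_getaffinity");
        return -1;
    }
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set))
            return cpu;
    }
    return -1;
}

/* Bind the calling thread to a single CPU; 0 on success, -1 on error. */
static int pin_self_to_cpu(int cpu)
{
    cpu_set_t set;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* pid 0 means "the calling thread" */
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return -1;
    }
    return 0;
}
```

A task would call pin_self_to_cpu() once at startup. With a one-CPU mask the scheduler has no other CPU to place the task on, so wakeup-time balancing no longer matters for it.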

            Ingo

