Re: [patch] voluntary-preempt-2.6.8-rc2-J3

From: Ingo Molnar (mingo_at_elte.hu)
Date: 07/26/04

  • Next message: Andrew Morton: "Re: Autotune swappiness01"
    Date:	Mon, 26 Jul 2004 22:36:34 +0200
    To: Andrew Morton <akpm@osdl.org>
    
    

    * Andrew Morton <akpm@osdl.org> wrote:

    > The bigger this thing gets, the more worried I get. Sometime this is
    > going to need to be split up into individual fixes, and they need to
    > be based upon an overall approach which we haven't yet settled on.

    i will do that splitup. Right now i'm simply mapping how widespread the
    problem is and what type of fixes we need. The situation isnt all that
    bad but we might need (an optional) mechanism to make softirqs
    synchronous. All of this stuff is nicely modular and i'll do a splitup
    post 2.6.8 (i dont think we want to disturb 2.6.8 with any of this).

    > In particular your whole approach (with voluntary_need_resched())
    > doesn't work on SMP.

    (for the record, voluntary_need_resched() == need_resched() - i'm only
    keeping a distinction between the two to be able to track progress and
    regressions between the vanilla and modified kernels.)

    need_resched() indeed doesnt do a lock-break for SMP purposes.

    > The approach I'm using is to unconditionally drop locks on every Nth pass
    > around the loop to allow another CPU to grab the lock, do some work, drop
    > the lock, then be preempted. eg:
    >
    > @@ -773,6 +774,12 @@ int get_user_pages(struct task_struct *t
    > struct page *map = NULL;
    > int lookup_write = write;
    >
    > + if ((++nr_pages & 63) == 0) {
    > + spin_unlock(&mm->page_table_lock);
    > + cpu_relax();
    > + spin_lock(&mm->page_table_lock);
    > + }
    > +

    guaranteeing latencies on SMP is hard, because the latency of a CPU
    might depend on the latency of a task on another CPU - which CPU isnt
    notified of the rescheduling request.

    one alternative technique to yours would be to notify _all_ CPUs that a
    high-prio RT task is ready to run (via a broadcast need-resched). That
    way the UP latency-break techniques map to SMP in a 1:1 way.

    non-RT tasks dont get this benefit, which is a difference to the UP
    situation, but i dont think it would be appropriate to use the UP
    behavior, due to the overhead of broadcasting.

    a combination of the two techniques could be used too: a global 'break
    locks from now on' flag which gets set if a (RT?) task wants to
    reschedule. Normally this flag would be zero and the cacheline would be
    clean and shared between all CPUs, causing no overhead. Once a task
    wants to reschedule it would increase by 1 until that task has been
    scheduled. This would remove the ugliness factor too, your sample code
    would look:

    > + if (need_lock_break()) {
    > + spin_unlock(&mm->page_table_lock);
    > + cpu_relax();
    > + spin_lock(&mm->page_table_lock);
    > + }
    > +

    or a shortcut:

    > + cond_lock_break(&mm->page_table_lock);

    there are two problems with the unconditional lock-break:

     - i'm not sure cpu_relax() guarantees that another CPU can grab this
       lock. It all depends on whether the lock cacheline can bounce over
       to the other CPU faster than cpu_relax() finishes and this CPU
       re-acquires the lock.

     - the latencies will still be higher than on UP, in a hard to predict
       way. Every time the codepath encounters a spinlock it has to take in
       order to exit for a reschedule, the latency gets re-added. Also, the
       now scheduled high-prio task will encounter the latency every time it
       acquires a spinlock - while on UP it has the CPU for itself with no
       interaction from other CPUs.

            Ingo
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Andrew Morton: "Re: Autotune swappiness01"

    Relevant Pages