Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement

From: Martin J. Bligh (mbligh_at_aracnet.com)
Date: 10/04/04

    Date:	Sun, 03 Oct 2004 16:47:10 -0700
    To: Paul Jackson <pj@sgi.com>
    
    

    --Paul Jackson <pj@sgi.com> wrote (on Sunday, October 03, 2004 09:02:09 -0700):

    > Martin wrote:
    >> The way cpusets uses the current cpus_allowed mechanism is, to me, the most
    >> worrying thing about it. Frankly, the cpus_allowed thing is kind of tacked
    >> onto the existing scheduler, and not at all integrated into it, and doesn't
    >> work well if you use it heavily (eg bind all the processes to a few CPUs,
    >> and watch the rest of the system kill itself).
    >
    > True. One detail of what you say I'm unclear on -- how will the rest of
    > the system kill itself? Why wouldn't the unemployed CPUs just idle
    > around, waiting for something to do?

    I think last time I looked they just sat there saying:

    Rebalance!
    Ooooh, CPU 3 over there looks heavily loaded, I'll steal something.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    Humpf. I give up.
    Rebalance!
    Ooooh, CPU 3 over there looks heavily loaded, I'll steal something.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    Humpf. I give up.
    Rebalance!
    Ooooh, CPU 3 over there looks heavily loaded, I'll steal something.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    That one. Try to migrate. Oops, no cpus_allowed bars me.
    Humpf. I give up.
    ... ad infinitum.

    Desperately boring, and rather ineffective.
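
    The loop caricatured above can be sketched as a toy user-space
    simulation. This is not the kernel's load balancer: the names Task,
    RunQueue, rebalance and simulate are hypothetical, and "busiest" is
    taken to mean simply the run queue holding the most tasks. It only
    shows that when every task on the busy CPU is pinned there, each
    rebalance pass scans them all and migrates nothing.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    cpus_allowed: frozenset  # CPU ids this task is permitted to run on


@dataclass
class RunQueue:
    cpu: int
    tasks: list = field(default_factory=list)


def rebalance(idle_rq, runqueues):
    """One rebalance pass on behalf of an idle CPU.

    Picks the run queue with the most tasks and tries each task in turn.
    Returns (migrated, attempts).
    """
    busiest = max(runqueues, key=lambda rq: len(rq.tasks))
    attempts = 0
    for task in list(busiest.tasks):
        attempts += 1
        if idle_rq.cpu in task.cpus_allowed:
            busiest.tasks.remove(task)
            idle_rq.tasks.append(task)
            return True, attempts
        # "Oops, no cpus_allowed bars me." Try the next one.
    return False, attempts  # "Humpf. I give up."


def simulate(passes=3, ntasks=10):
    """Pin every task to CPU 3, then let idle CPU 0 rebalance repeatedly.

    Returns (wasted migration attempts, tasks that ended up on CPU 0).
    """
    rqs = [RunQueue(cpu) for cpu in range(4)]
    pinned = frozenset({3})
    rqs[3].tasks = [Task(f"t{i}", pinned) for i in range(ntasks)]
    wasted = 0
    for _ in range(passes):
        migrated, attempts = rebalance(rqs[0], rqs)
        if not migrated:
            wasted += attempts
    return wasted, len(rqs[0].tasks)
```

    With the defaults, each of the three passes tries all ten pinned
    tasks and fails, so the attempts add up while CPU 0 stays idle.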

    > As I recall, Ingo added task->cpus_allowed for the Tux in-kernel web
    > server a few years back, and I piggy backed the cpuset stuff on that, to
    > keep my patch size small.
    >
    > Likely your same concerns apply to the task->mems_allowed field that
    > I added, in the same fashion, in my recent cpuset patch.

    Mmm, I'm less concerned about that one, or at least I can't specifically
    see how it breaks.
     
    > We need a mechanism that the cpuset apparatus respects that maps each
    > CPU to a sched_domain, exactly one sched_domain for any given CPU at any
    > point in time, regardless of which task it is considering running at the
    > moment. Somewhat like dual-channeled disks, having more than one
    > sched_domain apply at the same time to a given CPU leads to confusions
    > best avoided unless desperately needed.

    Agreed. The cpus_allowed mechanism doesn't seem well suited to heavy use
    anyway (I think John Hawkes had problems with it too). That's not your
    fault ... but I'm not convinced it's a good foundation to be building
    further things on either ;-)
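
    The constraint Paul states in the quoted text (each CPU belongs to
    exactly one sched_domain at any point in time) can be checked with a
    small sketch like the one below. It treats a domain as a flat set of
    CPU ids, which is an assumption made for illustration and not how
    the kernel represents sched_domains; the function name is
    hypothetical as well.

```python
def cpus_with_bad_domain_count(domains, cpus):
    """Return {cpu: count} for every CPU not covered by exactly one domain.

    domains: mapping of domain name -> set of CPU ids (illustrative model)
    cpus:    iterable of all CPU ids in the system
    An empty result means the domains partition the CPUs.
    """
    counts = {cpu: 0 for cpu in cpus}
    for members in domains.values():
        for cpu in members:
            counts[cpu] = counts.get(cpu, 0) + 1
    return {cpu: n for cpu, n in counts.items() if n != 1}
```

    A CPU listed in two domains shows up with a count of 2, and a CPU
    left out of every domain shows up with a count of 0.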

    M.



