SMP clusters, CC-Clusters and friends ...

From: Karim Yaghmour (karim_at_opersys.com)
Date: 09/05/03

  • Next message: rwhron_at_earthlink.net: "Re: 2.6.0-test4-mm5 dbench stuck in D state"
    Date:	Fri, 05 Sep 2003 00:10:55 -0400
    To: linux-kernel <linux-kernel@vger.kernel.org>
    
    

    Having spent quite some time in the summer of 2002 figuring out many of the
    various practical issues of implementing Linux SMP clusters, I was quite happy
    to see some actual discussion on the topic on LKML. Unfortunately the discussion
    has centered mostly on philosophical issues, and I wasn't interested very much
    in those: though I believe the kernel can be made to scale, and has done so
    successfully in the past, I think that an SMP cluster will always scale to a
    greater number of processors than the kernel can, regardless of its state of
    development (i.e. if the kernel scales well to 256 CPUs, then an SMP cluster
    using said kernel will scale to N*256, N being the number of cells/nodes to
    which the upper-layer SSI software can scale). That's just for the scalability
    argument, but there are other reasons why you'd want to have multiple
    independent images running side-by-side, as others have pointed out ...

    Personally, I'm much more interested in the how-do-we-do-this aspect of
    discussion. Though Steven Cole pointed out the paper I wrote in July 2002
    (thanks Steven), there's been little technical discussion around it.
    Needless to say that I'd welcome any feedback anyone may have on the ideas
    I put forth in the paper.

    To recap, the architecture I'm suggesting has the following advantages:
    - No changes to kernel's virtual memory code
    - No changes to kernel's scheduler
    - No changes to kernel's lock granularity
    - Minimal low-level changes to kernel code
    - Reuse of many existing software components
    - Short-term accessibility

    Unfortunately, I've been unable to put any time on this because of my
    involvement in other projects and the fact that I have to pay the rent ;)
    If anyone wants to take this forward on his own or if someone wants to fund
    work on this, I'd be glad be to help.

    Fortunately, I think the amount of work required to get a functional SMP
    cluster seems to decrease with time. The work already done on kexec and
    NUMA, for example, is sure to be of help in forking off multiple instances of
    the same kernel. Actually, I had a very good discussion about the use of
    kexec in the scheme I'm suggesting with Eric Biederman at the last OLS.

    If you're still reading this and are interested in the actual implementation
    of SMP clusters, have a look at what I'm suggesting and let me know what
    you think:
    http://www.opersys.com/ftp/pub/Adeos/practical-smp-clusters.ps
    http://www.opersys.com/ftp/pub/Adeos/practical-smp-clusters.pdf
    http://www.opersys.com/adeos/practical-smp-clusters/

    Cheers,

    Karim

    -- 
    Author, Speaker, Developer, Consultant
    Pushing Embedded and Real-Time Linux Systems Beyond the Limits
    http://www.opersys.com || karim@opersys.com || 514-812-4145
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at  http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at  http://www.tux.org/lkml/
    

  • Next message: rwhron_at_earthlink.net: "Re: 2.6.0-test4-mm5 dbench stuck in D state"