Re: [PATCH 0/7] dlm: overview

From: Lars Marowsky-Bree (lmb_at_suse.de)
Date: 04/27/05

  • Next message: Rafael J. Wysocki: "Re: [BUG] 2.6.12-rc3: unkillable java process in TASK_RUNNING on AMD64"
    Date:	Wed, 27 Apr 2005 15:56:35 +0200
    To: Daniel Phillips <phillips@istop.com>
    
    

    On 2005-04-26T01:30:16, Daniel Phillips <phillips@istop.com> wrote:

    > > Now that we have two (or three) options with actual users, now is
    > > the right time to finally come up with sane and useful abstractions.
    > > This is great.
    > Great thought, but it won't work unless you actually read them all,
    > which I hope is what you're proposing.

    Sure. As time permits ;-)

    > I'm a little skeptical about the chance of fitting an 11-parameter function
    > call into a generic kernel plug-in framework. Are those the exact same 11
    > parameters that God intended?

    An 11-parameter function, frankly, more often than not indicates that
    the interface is wrong. I know it's inherited from VMS, which is a
    perfectly legitimate reason, but I assume it might get cleaned / broken
    up in the future.

    > While it would be great to share a single dlm between gfs and ocfs2 - maybe
    > Lustre too - my crystal ball says that that laudable goal is unlikely to be
    > achieved in the near future, whereas there isn't much choice but to sort out
    > a common membership framework right now.

    Oh, sure. I just like to keep a long term vision in mind, an idea I'd
    think you approve of. ;-)

    Also I didn't say that they should necessarily _share_ a DLM; I assume
    there'll be more than one, just like we have more than one filesystem.
    But, can this be mapped to a common subset of features which an
    application/user/filesystem can assume to be present in every DLM,
    accessed via a common API? This does not preclude the option that one
    DLM will perform substantially better for some user than another one, or
    that one DLM takes advantage of some specific hardware feature which
    makes it run a magnitude faster on the z-Series for example.

    As I said, this is not something to do right now, but something to keep
    in mind going forward, but to keep in mind at every step.

    > As far as I can see, only cluster membership wants or needs a common
    > framework. And I'm not sure that any of that even needs to be in-kernel.

    Well, right now nobody proposes the membership to be in-kernel. What I'd
    like to see though is a common way of _feeding_ the membership to a
    given kernel component, and being able to query the kind of syntax &
    semantics it expects.

    Note that I said "given kernel component", because I assume from the
    start that a node might be part of several overlapping clusters. The
    membership I feed to the GFS DLM might not be the same I feed to OCFS2
    for another mount.

    Questions which need to be settled, or which the API at least needs to
    export so we know what is expected from us:

    - How do the node ids look like? Are they sparse integers, continuous
      ints, uuids, IPv4 or IPv6 address of the 'primary' IP of a node,
      hostnames...?

    - How are the communication links configured? How to tell it which
      interfaces to use for IP, for example?

    - How do we actually deliver the membership events -
      echo "current node list" >/sys/cluster/gfs/membership
      or...?
      
    - What kind of semantics are expected: Can we deliver the membership
      events as they come, do we need to introduce suspend/resume barriers
      etc?

    - How to security credentials play into this, and where are they
      enforced - so that a user-space app doesn't mess with kernel locks?

    Maybe initially we'll end up with those being "exported" in
    Documentation/{OCFS2,GFS}-DLM/ files, but ultimately it'd be nice if
    user-space could auto-discover them and do the right thing w/a minimum
    amount of configuration.

    Or maybe these will be abstracted by user-space wrapper libraries, and
    everybody does in the kernel what they deem best.

    It's just something which needs to be answered and decided ;-)

    Sincerely,
        Lars Marowsky-Brée <lmb@suse.de>

    -- 
    High Availability & Clustering
    SUSE Labs, Research and Development
    SUSE LINUX Products GmbH - A Novell Business
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at  http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at  http://www.tux.org/lkml/
    

  • Next message: Rafael J. Wysocki: "Re: [BUG] 2.6.12-rc3: unkillable java process in TASK_RUNNING on AMD64"

    Relevant Pages

    • Re: [PATCH 1b/7] dlm: core locking
      ... because defining the properties of the membership ... the delivery of the suspend/membership distribution/resume ... > trying to explain is that the dlm is in a different, ... into the global cluster resource management architecture, ...
      (Linux-Kernel)
    • Re: When is a Quaker a Quaker?
      ... ...Yet a Catholic is not a particular sort of Christian ... whose identity is tied to membership in the Roman Catholic ... Nevertheless, common or general usage *is* very important, just ...
      (soc.religion.quaker)
    • Re: resume vs CV
      ... membership to being a Cub Scout den leader to membership in civic ... with a list of certain kinds of hobbies and non-work interests. ... while certain kinds of sports or physical activities are ... In recent years it's been common to list "cooking" as an ...
      (alt.usage.english)
    • Re: [PATCH 1b/7] dlm: core locking
      ... > I have always thought that an distributed application could use the DLM ... > until after nodes that are no longer a member of the cluster are safely ... depending on whether a lock happens to master locally or not, ... node membership age, ...
      (Linux-Kernel)
    • Re: Copying a whole table
      ... >> refer to the table named 2004xxxxxx will have to be modified to refer to ... >> records something common from year to year). ... By including the membership date, you can return records for any one ... Stop thinking spreadsheet. ...
      (microsoft.public.access.gettingstarted)