Re: [patch 2/2] cpusets: add interleave_over_allowed option



On Thu, 25 Oct 2007, Christoph Lameter wrote:

> More interactions between cpusets and memory policies. We have to be
> careful here to keep clean semantics.


I agree.

> Isn't it a bit surprising for an application that has set up a custom
> MPOL_INTERLEAVE policy if the nodes suddenly change because of a cpuset or
> mems_allowed change?


Every MPOL_INTERLEAVE policy is a custom policy that the application has
set up. If you don't use cpusets at all, the nodemask you pass to
set_mempolicy() with MPOL_INTERLEAVE is static and won't change without
the application's knowledge. It has full control over the nodemask that
it desires to interleave over.

The problem occurs when you add cpusets into the mix and permit the
allowed nodes to change without the application's knowledge. Right now,
a simple remap is done so if the cardinality of the set of nodes
decreases, you're interleaving over a smaller number of nodes. If the
cardinality increases, your interleaved nodemask isn't expanded. That's
the problem that we're facing. The remap itself is troublesome because it
doesn't take into account the user's desire for a custom nodemask to be
used anyway; it could remap an interleaved policy onto several nodes that
already contend with one another.
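
To make the shrink-but-never-grow behaviour concrete, here is a rough
user-space model of an ordinal remap. It is only a sketch under
assumptions: nodes are bits in a 64-bit mask, and remap_nodes() and its
helpers are hypothetical names for illustration, not the kernel's code.
Each node in the policy is moved to the node holding the same ordinal
position in the new allowed set, wrapping around when the new set is
smaller.

```c
#include <stdint.h>

/* Hypothetical model: one bit per node, at most 64 nodes. */
typedef uint64_t nodemask;

/* Number of set bits in m. */
static int mask_weight(nodemask m)
{
    int w = 0;
    for (; m; m &= m - 1)
        w++;
    return w;
}

/* 0-based position of `node` among the set bits of m, or -1 if unset. */
static int ordinal_of(nodemask m, int node)
{
    if (!((m >> node) & 1))
        return -1;
    return mask_weight(m & ((1ULL << node) - 1));
}

/* Node number of the ord-th set bit of m (0-based), or -1 if none. */
static int nth_set_bit(nodemask m, int ord)
{
    for (int node = 0; node < 64; node++)
        if (((m >> node) & 1) && ord-- == 0)
            return node;
    return -1;
}

/*
 * Remap `policy` from old_allowed to new_allowed by ordinal position.
 * Policy nodes that were not in old_allowed are kept unchanged (an
 * assumption of this sketch), as is the whole policy if new_allowed
 * is empty.
 */
static nodemask remap_nodes(nodemask policy, nodemask old_allowed,
                            nodemask new_allowed)
{
    int w = mask_weight(new_allowed);
    nodemask out = 0;

    if (w == 0)
        return policy;
    for (int node = 0; node < 64; node++) {
        if (!((policy >> node) & 1))
            continue;
        int ord = ordinal_of(old_allowed, node);
        if (ord < 0)
            out |= 1ULL << node;
        else
            out |= 1ULL << nth_set_bit(new_allowed, ord % w);
    }
    return out;
}
```

In this model, remap_nodes(0xF, 0xF, 0x3) collapses an interleave over
nodes 0-3 onto nodes 0-1, and remap_nodes(0x3, 0x3, 0xF) leaves it on
nodes 0-1 when the allowed set grows back to nodes 0-3.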

Normally, MPOL_INTERLEAVE is used to reduce bus contention to improve the
throughput of the application. If you remap the number of nodes to
interleave over, which is currently how it's done when mems_allowed
changes, you could actually be increasing latency because you're
interleaving over the same bus.

This isn't a memory policy problem because all it does is effect a
specific policy over a set of nodes. With my change, cpusets are required
to update the interleaved nodemask if the user specified that they desire
the feature with interleave_over_allowed. Cpusets are, after all, the
ones that changed the mems_allowed in the first place and invalidated our
custom interleave policy. We simply can't make inferences about what we
should do, so we allow the creator of the cpuset to specify it for us. So
the proper place to modify an interleaved policy is in cpusets and not
mempolicy itself.
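
The decision I'm describing can be sketched as below. This is a
simplified model, not the patch: struct cpuset_model and
rebind_interleave() are made-up names, and it assumes that with the
option set the interleaved nodemask simply becomes the cpuset's new
mems_allowed, while without it the caller's remapped mask is used.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the cpuset state that matters here. */
struct cpuset_model {
    uint64_t mems_allowed;          /* one bit per allowed node */
    bool interleave_over_allowed;   /* option set by the cpuset's creator */
};

/*
 * Pick the interleave nodemask after mems_allowed has changed.
 * `remapped` is whatever the existing remap would have produced.
 */
static uint64_t rebind_interleave(const struct cpuset_model *cs,
                                  uint64_t remapped)
{
    if (cs->interleave_over_allowed)
        return cs->mems_allowed;    /* interleave over everything allowed */
    return remapped;                /* keep the remap behaviour */
}
```

The point of putting the flag on the cpuset is that whoever changes
mems_allowed is also the one who states what should happen to the
interleaved nodemask.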

David
