Re: [Lhms-devel] [PATCH 0/7] Fragmentation Avoidance V19

From: Gerrit Huizenga (gh_at_us.ibm.com)
Date: 11/02/05

  • Next message: Duncan Sands: "Re: [PATCH] Eagle and ADI 930 usb adsl modem driver"
    To: Ingo Molnar <mingo@elte.hu>
    Date:	Tue, 01 Nov 2005 23:46:35 -0800
    
    

    On Wed, 02 Nov 2005 08:19:43 +0100, Ingo Molnar wrote:
    >
    > * Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
    >
    > > My own target is NUMA node hotplug, what NUMA node hotplug want is
    > > - [remove the range of memory] For this approach, admin should define
    > > *core* node and removable node. Memory on removable node is removable.
    > > Dividing area into removable and not-removable is needed, because
    > > we cannot allocate any kernel's object on removable area.
    > > Removable area should be 100% removable. Customer can know the limitation
    > > before using.
    >
    > that's a perfectly fine method, and is quite similar to the 'separate
    > zone' approach Nick mentioned too. It is also easily understandable for
    > users/customers.
    >
    > under such an approach, things become easier as well: if you have zones
    > you can to restrict (no kernel pinned-down allocations, no mlock-ed
    > pages, etc.), there's no need for any 'fragmentation avoidance' patches!
    > Basically all of that RAM becomes instantly removable (with some small
    > complications). That's the beauty of the separate-zones approach. It is
    > also a limitation: no kernel allocations, so all the highmem-alike
    > restrictions apply to it too.
    >
    > but what is a dangerous fallacy is that we will be able to support hot
    > memory unplug of generic kernel RAM in any reliable way!
    >
    > you really have to look at this from the conceptual angle: 'can an
    > approach ever lead to a satisfactory result'? If the answer is 'no',
    > then we _must not_ add a 90% solution that we _know_ will never be a
    > 100% solution.
    >
    > for the separate-removable-zones approach we see the end of the tunnel.
    > Separate zones are well-understood.
    >
    > generic unpluggable kernel RAM _will not work_.

    Actually, it will. Well, depending on terminology.

    There are two usage models here - those which intend to remove physical
    elements and those where the kernel returns management of its virtualized
    "physical" memory to a hypervisor. In the latter case, a hypervisor
    already maintains a virtual map of the memory and the OS needs to release
    virtualized "physical" memory. I think you are referring to RAM here as
    the physical component; however these same defrag patches help where a
    hypervisor is maintaining the real physical memory below the operating
    system and the OS is managing a virtualized "physical" memory.

    On pSeries hardware or with Xen, a client OS can return chunks of memory
    to the hypervisor. That memory needs to be returned in chunks of the
    size that the hypervisor normally manages/maintains. But long ranges
    of physical contiguity are not required. Just shorter ranges, depending
    on what the hypervisor maintains, need to be returned from the OS to
    the hypervisor.

    In other words, if we can return 1 MB chunks, the hypervisor can hand
    out those 1 MB chunks to other domains/partitions. So, if we can return
    500 1 MB chunks from a 2 GB OS instance, we can add 500 MB dynamically
    to another OS image.
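
A rough sketch of the arithmetic above, to show why fragmentation matters for this model. It is not a kernel or hypervisor interface: the 4 KB page size, the function name returnable_chunks, and both free-page layouts are assumptions made up for illustration. Only the 1 MB chunk size, the 2 GB instance, and the 500 chunks come from the example above.

```python
PAGE_SIZE = 4 * 1024                        # assumed guest page size (4 KB)
CHUNK_SIZE = 1024 * 1024                    # 1 MB chunk, as in the example above
PAGES_PER_CHUNK = CHUNK_SIZE // PAGE_SIZE   # 256 pages per chunk


def returnable_chunks(free_pages):
    """Return the sorted indices of chunks whose pages are all free.

    free_pages is an iterable of distinct free page frame numbers.
    Only a completely free chunk could be handed back as a unit.
    """
    free_per_chunk = {}
    for pfn in free_pages:
        chunk = pfn // PAGES_PER_CHUNK
        free_per_chunk[chunk] = free_per_chunk.get(chunk, 0) + 1
    return sorted(c for c, n in free_per_chunk.items() if n == PAGES_PER_CHUNK)


# Hypothetical layout 1: 500 MB free, packed into the first 500 whole chunks.
packed = set(range(500 * PAGES_PER_CHUNK))

# Hypothetical layout 2: the same number of free pages, but only every
# fourth page is free, so no 1 MB chunk is ever completely free.
# The highest page used (511996) still fits inside a 2 GB instance.
scattered = {4 * i for i in range(500 * PAGES_PER_CHUNK)}

print(len(returnable_chunks(packed)))       # whole chunks free when packed
print(len(returnable_chunks(scattered)))    # whole chunks free when scattered
```

Both layouts have the same amount of free memory, but only the packed one yields chunks that could be returned whole.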

    This happens to be a *very* satisfactory answer for virtualized environments.

    The other answer, which is harder, is to return (free) entire large physical
    chunks, e.g. the size of the full memory of a node, allowing a node to be
    dynamically removed (or a DIMM/SIMM/etc.).

    So, people are working towards two distinct solutions, both of which
    require us to do a better job of defragmenting memory (or avoiding
    fragmentation in the first place).

    gerrit

