Re: boot time node and memory limit options

From: Dave Hansen (haveblue_at_us.ibm.com)
Date: 03/17/04

    To: "Martin J. Bligh" <mbligh@aracnet.com>
    Date:	Wed, 17 Mar 2004 09:09:45 -0800
    
    
    

    On Wed, 2004-03-17 at 08:36, Martin J. Bligh wrote:
    > > The patch I posted was arrived at after some people suggested an
    > > architecture independent patch. My patch basically allocates memory
    > > from the bootmem allocator before mem_init calls free_all_bootmem_core.
    > > It's architecture independent. If the real goal is to limit physical
    > > memory before the bootmem allocator is initialized, then my current
    > > patch doesn't accomplish this.
    >
    > Don't we have the same arch-dependent issue with the current mem= anyway?
    > Can we come up with something where the arch code calls back into a generic
    > function to derive limitations, and thereby at least get the parsing done
    > in a common routine for consistency? There aren't *that* many NUMA arches
    > to change anyway ...

    The problem with doing it in generic code is that it has to happen
    _after_ the memory layout is discovered. It's a mess to reconstruct all
    of the necessary information about where holes stop and start, at least
    from the current information that we store. Then, you have to go track
    down any information that might have "leaked" into the arch code before
    you parsed the mem=, which includes all of the {min,max}_{high,low}_pfn
    variables. I prefer to just take care of it at its source, where the NUMA
    information is read out of the hardware.

    Every arch has its own way of describing its layout. Some use "chunks"
    and others like ppc64 use LMB (logical memory blocks). If each arch were
    willing to store its memory layout information in a generic way, then
    we might have a shot at doing a generic mem= or a NUMA version.
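
To illustrate what a layout stored "in a generic way" could buy us, here is a minimal userspace sketch in C. The `struct mem_region` type and the `apply_mem_limit()` helper are hypothetical names made up for this example, not existing kernel interfaces. The sketch assumes that mem= means "keep at most this many bytes of RAM, counted from the lowest address upward", that holes do not count toward the limit, and that the regions are already sorted by address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arch-neutral description of one contiguous range of RAM. */
struct mem_region {
    uint64_t start;  /* first byte of the range */
    uint64_t end;    /* one past the last byte of the range */
    int nid;         /* NUMA node that owns the range */
};

/*
 * Trim an address-sorted region array so that at most `limit` bytes of
 * RAM remain.  Holes between regions are not counted.  Returns the new
 * number of valid entries; entries past that index should be ignored.
 */
size_t apply_mem_limit(struct mem_region *r, size_t n, uint64_t limit)
{
    uint64_t kept = 0;  /* bytes of RAM accepted so far */
    size_t i;

    for (i = 0; i < n; i++) {
        uint64_t size = r[i].end - r[i].start;

        if (kept + size >= limit) {
            /* This region reaches the limit: cut it short and stop. */
            r[i].end = r[i].start + (limit - kept);
            return (r[i].end > r[i].start) ? i + 1 : i;
        }
        kept += size;
    }
    return n;  /* the limit is larger than all of RAM: nothing to trim */
}
```

With something like this, each arch would only have to fill in the array from its own tables (e820, SRAT, LMB) and call the common routine. The derived pfn variables would then be computed from the trimmed array instead of being fixed up afterwards.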

    I coded this up a few days ago to see if I could replace the x440 SRAT
    chunks with it. I never got around to actually doing that part, but
    something like this is what we need to do *layout* manipulation in an
    architecture-agnostic way.

    I started coding this before I thought *too* much about it. What I want
    is a way to get rid of all of the crap that each architecture (and
    subarch) uses to store its physical memory layout. On normal x86 we
    have the e820 and the EFI tables and on Summit/x440, we have yet another
    way to do it.

    What I'd like to do is present a standard way for all of these
    architectures to store the information that they need to record at boot
    time, plus make something flexible enough that we can use it for stuff
    at runtime when hotplug memory is involved.

    The code I'd like to see go away from boot-time is anything that deals
    with arch-specific structures like the e820, functions like
    lmb_end_of_DRAM(), or any code that deals with zholes. I'd like to get
    it to a point where we can do a mostly arch-independent mem=.

    So, here's a little bit of (now userspace) code that implements a very
    simple way to track physical memory areas.

    stuff that sucks:
    - long type names/indiscriminate use of u64
    - "section" is on my brain from CONFIG_NONLINEAR, probably don't want
      to use that name again
    - Doesn't coalesce adjacent sections with identical attributes, only
      extends existing ones.
    - could sort arrays instead of using lists for speed/space
    - can leave "UNDEF" holes
    - can't add new sections spanning 2 old ones
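
The code referred to above is not reproduced in this copy of the message. As a rough illustration only, and not the code that was posted, a list-based tracker with the same limitations could look like the userspace sketch below. All names (`struct mem_section`, `enum mem_attr`, `mem_section_add()`) are hypothetical; the message itself names only "sections", u64, lists, and "UNDEF".

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical attribute values; only "UNDEF" appears in the message. */
enum mem_attr { MEM_UNDEF, MEM_RAM, MEM_RESERVED };

struct mem_section {
    uint64_t start;            /* first byte covered */
    uint64_t end;              /* one past the last byte covered */
    enum mem_attr attr;
    struct mem_section *next;  /* next section, ordered by start address */
};

/*
 * Record [start, end) with the given attribute in an address-sorted list.
 * An existing section is extended when the new range begins exactly where
 * it ends and the attributes match.  Overlapping ranges are refused, so a
 * new section cannot span two old ones, and a section is never merged
 * with its successor.  Returns 0 on success, -1 on error.
 */
int mem_section_add(struct mem_section **head, uint64_t start,
                    uint64_t end, enum mem_attr attr)
{
    struct mem_section **link = head;
    struct mem_section *prev = NULL, *s;

    if (start >= end)
        return -1;

    /* Walk to the first section that starts at or after the new range. */
    while (*link && (*link)->start < start) {
        prev = *link;
        link = &prev->next;
    }

    /* Refuse ranges that overlap a neighbour on either side. */
    if ((prev && prev->end > start) || (*link && (*link)->start < end))
        return -1;

    /* Extend the preceding section if it abuts and the attributes match. */
    if (prev && prev->end == start && prev->attr == attr) {
        prev->end = end;
        return 0;
    }

    s = malloc(sizeof(*s));
    if (!s)
        return -1;
    s->start = start;
    s->end = end;
    s->attr = attr;
    s->next = *link;
    *link = s;
    return 0;
}
```

Any address range that no call ever covers simply has no node in the list, which is how this sketch ends up with "UNDEF" holes.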

    -- dave

    
    
    




