Re: objrmap-core-1 (rmap removal for file mappings to avoid 4:4 in <=16G machines)

From: Andrea Arcangeli (andrea_at_suse.de)
Date: 03/09/04

  • Next message: Tim Schmielau: "[PATCH][RFC] fix BSD accounting (w/ long-term perspective ;-)"
    Date:	Tue, 9 Mar 2004 00:02:47 +0100
    To: Andrew Morton <akpm@osdl.org>
    
    

    On Mon, Mar 08, 2004 at 01:23:05PM -0800, Andrew Morton wrote:
    > There is an architectural concern: we're now treating anonymous pages
    > differently from file-backed ones. But we already do that in some places

    I'm working on that.

    > Other issues are how it will play with remap_file_pages(), and how it
    > impacts Ingo's work to permit remap_file_pages() to set page permissions on
    > a per-page basis. This change provides large performance improvements to

    in the current form it should be using pte_chains still for nonlinear
    vmas, see the function that pretends to convert the page to be like
    anonymous memory (which simply means to use pte_chains for the reverse
    mappings). I admit I didn't focus much on that part though, I trust
    Dave on that ;), since I want to drop it.

    What I want to do with the nonlinear vmas is to scan all the ptes in
    every nonlinear vma, so I don't have to allocate the pte_chain and the
    swapping procedure will simply be more cpu hungry under nonlinear vmas.
    I'm not interested to provide optimal performance in swapping nonlinear
    vmas, I prefer the fast path to be as fast as possible and without
    memory overhead. nonlinear vmas are supposed to speedup the workload. If
    one needs to swap efficiently, the vmas will do it (by carrying some
    memory overhead, like pte_chains would carry too).

    > UML, making it more viable for various virtual-hosting applications. I
    > don't immediately see any reason why objrmap should kill that off, but if
    > it does we're in the position of trading off UML virtual server performance
    > against monster highmem viability. That's less clear.

    I still don't see objrmap as a highmem related things, by side effect of
    being more efficient avoids the needs of 4:4 too, but that's just a side
    effect.

    the potential additional cpu consumtion with many vmas from the same
    file at different offsets is something I'm slightly concerned about too
    but my priority is to optimize the fast path, and the slowdown is not
    something I worry about too much since it should be still a lot better
    than the pagetable walk of 2.4 where one had to throw away the whole
    address space before one could free a shared page (and still it was far
    from being cpu bound). I'm also using objrmap in 2.4 to actually swap
    lots of shm, some of which having tons of vmas for the same file, and
    while the objrmap function remains at the top of the profiling, at least
    the swap_out loop is gone and avoid to trow away the whole address
    space. I mean it's still a lot more efficient than having to scan all
    the ptes in the system to find the right page, or like 2.4 stock is
    doing to throw away all the address space to swap 4k.

    and you know well that 2.6 swaps slower (or anyways not faster) than
    2.4, see the posts on linux-mm or the complains from the lowmem users,
    there are various bits involved into the swapping and paging that are
    more important than saving cpu during swapping, which is normally an I/O
    bound thing.
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Tim Schmielau: "[PATCH][RFC] fix BSD accounting (w/ long-term perspective ;-)"

    Relevant Pages