Re: [00/41] Large Blocksize Support V7 (adds memmap support)



On Fri, Sep 14, 2007 at 06:48:55AM +1000, Nick Piggin wrote:
On Thursday 13 September 2007 12:01, Nick Piggin wrote:
On Thursday 13 September 2007 23:03, David Chinner wrote:
Then just do operations on directories with lots of files in them
(tens of thousands). Every directory operation will require at
least one vmap in this situation - e.g. a traversal will result in
lots and lots of blocks being read that will require vmap() for every
directory block read from disk and an unmap almost immediately
afterwards when the reference is dropped....

Ah, wow, thanks: I can reproduce it.

OK, the vunmap batching code wipes your TLB flushing and IPIs off
the table. Diffstat below, but the TLB portions are here (besides that
_everything_ is probably lower due to fewer TLB misses caused by the
TLB flushing):

  -170   -99.4%  sn2_send_IPI
  -343  -100.0%  sn_send_IPI_phys
-17911   -99.9%  smp_call_function

Total run time dropped by about 30% on a 64-way system (from 248
seconds to 172 seconds to run parallel finds over different huge
directories).
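
To illustrate why batching removes almost all of the IPI traffic, here
is a toy userspace model, not kernel code. Every name in it is
hypothetical, and it assumes one global TLB flush costs one IPI per
remote CPU and that the batched variant flushes once per fixed number
of queued unmaps. The thread does not describe the real patch's
batching policy, so treat the numbers as a sketch only.

```c
#include <assert.h>

/* Toy model: counts flushes and IPIs, touches no real page tables. */
struct flush_stats {
    unsigned long flushes; /* global TLB flushes issued */
    unsigned long ipis;    /* IPIs sent: one per remote CPU per flush */
};

/* One global flush interrupts every CPU except the one issuing it. */
static void global_flush(struct flush_stats *s, unsigned int ncpus)
{
    s->flushes++;
    s->ipis += ncpus - 1;
}

/* Eager policy: every unmap of a directory block flushes immediately. */
struct flush_stats unmap_eager(unsigned long nblocks, unsigned int ncpus)
{
    struct flush_stats s = {0, 0};

    assert(ncpus > 0);
    for (unsigned long i = 0; i < nblocks; i++)
        global_flush(&s, ncpus);
    return s;
}

/*
 * Batched policy (assumed): unmaps are queued, and a single flush covers
 * each full batch. A final flush handles any leftover queued unmaps.
 */
struct flush_stats unmap_batched(unsigned long nblocks, unsigned int ncpus,
                                 unsigned long batch)
{
    struct flush_stats s = {0, 0};
    unsigned long pending = 0;

    assert(ncpus > 0 && batch > 0);
    for (unsigned long i = 0; i < nblocks; i++) {
        if (++pending == batch) {
            global_flush(&s, ncpus);
            pending = 0;
        }
    }
    if (pending)
        global_flush(&s, ncpus);
    return s;
}
```

With 10000 directory blocks on 64 CPUs, the eager variant in this model
sends 630000 IPIs, while a batch size of 64 (an arbitrary choice) sends
9891, which is the same order of reduction as the profile deltas above.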

Good start, Nick ;)


23012  54790.5%  _read_lock
 9427    329.0%  __get_vm_area_node
 5792      0.0%  __find_vm_area
 1590  53000.0%  __vunmap
....

_read_lock? I thought vmap() and vunmap() only took the vmlist_lock in
write mode.....

Next I have some patches to scale the vmap locks and data
structures better, but they're not quite ready yet. This looks like it
should result in a further speedup of several times when combined
with the TLB flushing reductions here...

Sounds promising - when you have patches that work, send them my
way and I'll run some tests on them.

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group


