Re: [PATCH -mm] mm: more likely reclaim MADV_SEQUENTIAL mappings



On Tuesday 22 July 2008 12:36, Rik van Riel wrote:
On Tue, 22 Jul 2008 12:02:26 +1000
Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
I don't actually care what the man page or posix says if it is obviously
silly behaviour. If you want to dispute the technical points of my post,
that would be helpful.

Application writers read the man page and expect MADV_SEQUENTIAL
to do roughly what the name and description imply.

If you think that the kernel should not bother implementing
what the application writers expect, and the application writers
should implement special drop-behind magic for Linux, your
expectations may not be entirely realistic.
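
For readers unfamiliar with the hint under discussion, this is roughly
what the application side looks like. It is a minimal sketch, not code
from this thread: the helper name sum_file_sequential_hint is
hypothetical, and summing the bytes merely stands in for whatever
single sequential pass the application performs.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map the whole file, tell the kernel the access
 * pattern is sequential, then read every byte exactly once.
 * Returns 0 on success, -1 on error. */
static int sum_file_sequential_hint(int fd, uint64_t *sum)
{
    struct stat st;

    if (fstat(fd, &st) < 0)
        return -1;

    *sum = 0;
    if (st.st_size == 0)
        return 0;               /* mmap() rejects zero-length mappings */

    size_t len = (size_t)st.st_size;
    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        return -1;

    /* The hint is advisory, so a failure here is deliberately ignored. */
    (void)madvise(p, len, MADV_SEQUENTIAL);

    for (size_t i = 0; i < len; i++)
        *sum += p[i];

    return munmap(p, len);
}
```

What the kernel does with the pages behind the read position after this
call is exactly the point in dispute.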

The simple fact is that if you already have the knowledge and custom
code for sequentially accessed mappings, and you know the pages are
not going to be used again, then unmapping them yourself is a *far*
better way to do it than anything the kernel will ever be able to do
on its own.

Also, it would be perfectly valid to want a sequentially accessed
mapping but not want to drop the pages early.

What we should do is update the man page now rather than try adding
things to support it.


Consider this: if the app already has dedicated knowledge and
syscalls to know about this big sequential copy, then it should
go about doing it the *right* way and really get a performance
improvement. Automatic unmap-behind, even if it were perfect, would
still need to scan the LRU lists to reclaim.

Doing nothing _also_ ends up with the kernel scanning the
LRU lists, once memory fills up.

But we are not doing nothing, because we already know, and have coded
for, the fact that the mapping will be accessed once, sequentially.
Now that we have gone this far, we should actually do it properly:
(1) unmap after use, and (2) issue POSIX_FADV_DONTNEED after use. This
will give you much better performance and cache behaviour than any
automatic detection scheme, and it doesn't introduce any regressions
for existing code.
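
The two steps above can be sketched as follows. This is an illustration
under stated assumptions, not code from the thread: the helper name
sum_file_drop_behind and the idea of walking the file in fixed-size
windows are hypothetical choices, and the caller is assumed to pass a
window size that is a multiple of the page size so that every mmap()
offset is page-aligned.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read a file once, sequentially, one window at a
 * time, dropping each window as soon as it has been consumed.
 * `window` must be a non-zero multiple of the page size.
 * Returns 0 on success, -1 on error. */
static int sum_file_drop_behind(int fd, size_t window, uint64_t *sum)
{
    struct stat st;

    if (window == 0 || fstat(fd, &st) < 0)
        return -1;

    *sum = 0;
    for (off_t off = 0; off < st.st_size; off += (off_t)window) {
        size_t len = window;
        if ((off_t)len > st.st_size - off)
            len = (size_t)(st.st_size - off);   /* short final window */

        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, off);
        if (p == MAP_FAILED)
            return -1;

        for (size_t i = 0; i < len; i++)
            *sum += p[i];

        /* Step 1: unmap after use. */
        if (munmap(p, len) < 0)
            return -1;

        /* Step 2: tell the kernel the cached pages for this range are
         * no longer needed. The call is advisory and returns an error
         * number rather than setting errno; the result is ignored. */
        (void)posix_fadvise(fd, off, (off_t)len, POSIX_FADV_DONTNEED);
    }
    return 0;
}
```

A caller might pass a small multiple of sysconf(_SC_PAGESIZE) as the
window; a larger window means fewer syscalls but more pages resident at
once.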


Scanning the LRU lists is a given.

It is not.


All that the patch by Johannes does is make sure the kernel
does the right thing when it runs into an MADV_SEQUENTIAL
page on the inactive_file list: evict the page immediately,
instead of having it pass through the active list and the
inactive list again.

This reduces the number of times that MADV_SEQUENTIAL pages
get scanned from 3 to 1, while protecting the working set
from MADV_SEQUENTIAL pages.
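
The "3 to 1" figure can be illustrated with a toy model. This is not
kernel code and not the patch itself: the enum, the function name
scans_until_evicted and the single `referenced` flag are all
hypothetical simplifications of the behaviour described above, for one
mapped file page that was touched once.

```c
#include <stdbool.h>

enum lru_list { INACTIVE, ACTIVE, EVICTED };

/* Toy model: count how many times reclaim looks at one page, which
 * starts on the inactive list with its referenced bit set, before the
 * page is evicted. */
static int scans_until_evicted(bool honour_sequential_hint)
{
    enum lru_list where = INACTIVE;
    bool referenced = true;     /* the single sequential access */
    int scans = 0;

    while (where != EVICTED) {
        scans++;
        if (where == INACTIVE) {
            if (referenced && !honour_sequential_hint) {
                /* Referenced page: promoted to the active list. */
                referenced = false;
                where = ACTIVE;
            } else {
                /* Unreferenced, or hint says it will not be reused. */
                where = EVICTED;
            }
        } else {
            /* Aged off the active list, back to the inactive list. */
            where = INACTIVE;
        }
    }
    return scans;
}
```

In this model the page is scanned three times without the hint
(inactive, active, inactive) and once with it.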

We should update the man page. And seeing as Linux has never preferred
to drop behind *before* now, it is crazy to add such a feature, which
we will then have a much harder time removing, given that it is
clearly suboptimal. Update the man page to sketch the *correct* way to
optimise this type of access.
