Re: Swappiness vs. mmap() and interactive response
- From: Elladan <elladan@xxxxxxxxxx>
- Date: Wed, 29 Apr 2009 21:14:39 -0700
On Tue, Apr 28, 2009 at 11:34:55PM -0700, Andrew Morton wrote:
On Wed, 29 Apr 2009 14:51:07 +0900 (JST) KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
Hi
On Tue, Apr 28, 2009 at 05:09:16PM +0800, Wu Fengguang wrote:
The semi-drop-behind is a great idea for the desktop - to put just-accessed
pages at the end of the LRU. However, I'm still afraid it vastly
changes the caching behavior and won't work as well as expected in server
workloads - shall we verify this?
Back to this big-cp-hurts-responsiveness issue. Background write
requests can easily pass the io scheduler's obstacles and fill up
the disk queue. Now every read request will have to wait for 10+ writes
- leading to a 10x slowdown of major page faults.
I reached this conclusion based on recent CFQ code reviews. Will bring up
a queue depth limiting patch for more exercises.
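As a rough illustration of the arithmetic above, here is a toy model. It
assumes a strictly FIFO disk queue and an identical service time for every
request, which is not how CFQ actually schedules; the function name and the
8 ms figure are made up for the example.

```python
def read_wait_ms(queued_writes, service_ms):
    """Latency of one read that sits behind `queued_writes` writes in a FIFO queue.

    Assumption: every request (read or write) takes `service_ms` to service.
    """
    if queued_writes < 0 or service_ms < 0:
        raise ValueError("queued_writes and service_ms must be non-negative")
    # The read waits for every write ahead of it, then is serviced itself.
    return (queued_writes + 1) * service_ms


if __name__ == "__main__":
    service_ms = 8.0  # hypothetical per-request service time
    for depth in (0, 10, 32):
        print(depth, "queued writes ->", read_wait_ms(depth, service_ms), "ms")
```

With 10 writes queued ahead of it, the read takes 11 service times in this
model, which is roughly the 10x slowdown described above.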
We can muck with the I/O scheduler, but another thing to consider is
whether the VM should be more aggressively throttling writes in this
case; it sounds like the big cp in this case may be dirtying pages so
aggressively that it's driving other (more useful) pages out of the
page cache --- if the target disk is slower than the source disk (for
example, backing up a SATA primary disk to a USB-attached backup disk)
no amount of drop-behind is going to help the situation.
So that leaves three areas for exploration:
* Write-throttling
* Drop-behind
* background writes pushing aside foreground reads
Hmm, note that although the original bug reporter is running Ubuntu
Jaunty, and hence 2.6.28, this problem is going to get *worse* with
2.6.30, since we have the ext3 data=ordered latency fixes which will
write out any journal activity, and worse, any synchronous commits
(i.e., caused by fsync) will force out all of the dirty pages with
WRITE_SYNC priority. So with a heavy load, I suspect this is going to
be more of a VM issue, and especially figuring out how to tune more
aggressive write-throttling may be key here.
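To see roughly where write-throttling currently starts on a given machine,
a minimal sketch follows. It reads the vm.dirty_background_ratio and
vm.dirty_ratio sysctls and converts them to sizes. Assumptions: the
percentages are applied to MemTotal for simplicity, whereas the kernel
applies them to a smaller amount of memory it considers dirtyable, so the
output is only a rough upper bound; the helper names are made up.

```python
def read_int(path):
    """Read the first integer from a one-line sysctl file."""
    with open(path) as f:
        return int(f.read().split()[0])


def mem_total_kb(meminfo_text):
    """Extract MemTotal (in kB) from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def ratio_to_kb(total_kb, ratio_percent):
    """Convert a percentage knob into kB of `total_kb` (integer, rounded down)."""
    return total_kb * ratio_percent // 100


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            total = mem_total_kb(f.read())
        for knob in ("dirty_background_ratio", "dirty_ratio"):
            pct = read_int("/proc/sys/vm/" + knob)
            print(knob, "=", pct, "% -> about", ratio_to_kb(total, pct), "kB")
    except (OSError, ValueError) as exc:
        print("could not read VM settings:", exc)
```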
First, I'd like to report the results of my reproduction test.
test environment: no LVM, copy ext3 to ext3 (not mv), swappiness unchanged,
CFQ is used, userland is Fedora 10, mmotm (2.6.30-rc1 + mm patch),
CPU Opteron x4, mem 4G
mouse move lag: did not happen
window move lag: did not happen
Mapped pages decreasing rapidly: did not happen (I guess these pages stay on
the active list on my system)
page fault large latency: happened (latencytop displays >200ms)
hm. The last two observations appear to be inconsistent.
Elladan, have you checked to see whether the Mapped: number in
/proc/meminfo is decreasing?
Yes, Mapped decreases while a large file copy is ongoing. It increases again
if I use the GUI.
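For anyone who wants to watch the same counter during a copy, a small
sketch that samples the Mapped: line of /proc/meminfo is given here. The
function names, sample count and interval are arbitrary choices for the
example, not something taken from this thread.

```python
import time


def parse_meminfo_kb(text, field="Mapped"):
    """Return the value in kB of one /proc/meminfo field, or None if absent."""
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() == field:
            return int(rest.split()[0])
    return None


def sample_mapped(samples=3, interval=0.5, path="/proc/meminfo"):
    """Read Mapped: `samples` times, sleeping `interval` seconds between reads."""
    readings = []
    for i in range(samples):
        if i:
            time.sleep(interval)
        with open(path) as f:
            readings.append(parse_meminfo_kb(f.read()))
    return readings


if __name__ == "__main__":
    try:
        print("Mapped (kB):", sample_mapped())
    except OSError as exc:
        print("could not read /proc/meminfo:", exc)
```

Running it once while idle and once during the large copy should show
whether Mapped is shrinking on a particular system.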
So I don't suspect the VM replacement logic for now,
but I need to investigate more.
I plan to try the following things today and tomorrow:
- XFS
- LVM
- another io scheduler (thanks Ted, good view point)
- Rik's new patch
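When comparing runs across io schedulers, it helps to record which
scheduler and swappiness value were actually in effect. A minimal sketch,
assuming the usual sysfs format where the active scheduler is shown in
square brackets; the default device name "sda" and the helper names are
placeholders.

```python
import sys


def active_scheduler(text):
    """Return the scheduler shown in [brackets] in a queue/scheduler string."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_text(path):
    """Return the full contents of a small text file."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    dev = sys.argv[1] if len(sys.argv) > 1 else "sda"  # placeholder device name
    try:
        print("swappiness:", read_text("/proc/sys/vm/swappiness").strip())
        sched = read_text("/sys/block/%s/queue/scheduler" % dev)
        print("scheduler:", active_scheduler(sched))
    except OSError as exc:
        print("could not read settings:", exc)
```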
It's not clear that we know what's happening yet, is it? It's such a
gross problem that you'd think that even our testing would have found
it by now :(
Elladan, do you know if earlier kernels (2.6.26 or thereabouts) had
this severe a problem?
No, I don't know about older kernels.
Also, just to add a bit: I'm having some difficulty reproducing the extremely
severe latency I was seeing right off. It's not difficult for me to reproduce
latencies that are painful, but not on the order of 10 second response. Maybe
3 or 4 seconds at most. I didn't have a stopwatch handy originally though, so
it's somewhat subjective, but I wonder if there's some element of the load that
I'm missing.
I had a theory about why this might be: my original repro was copying data
which I believe had been written once, but never read. Plus, I was using
relatime. However, on second thought this doesn't work -- there are only 8000
files, and a re-test with atime turned on isn't much different from one with
relatime.
The other possibility is that there was some other background IO load spike,
which I didn't notice at the time. I don't know what that would be though,
unless it was one of gnome's indexing jobs (I didn't see one, though).
-Elladan