Re: make getdents/readdir POSIX compliant wrt mount-point dirent.d_ino
- From: Jim Meyering <jim@xxxxxxxxxxxx>
- Date: Fri, 06 Nov 2009 00:28:56 +0100
Theodore Tso wrote:
> On Wed, Nov 04, 2009 at 08:29:00PM +0100, Jim Meyering wrote:
>> One way to accommodate the current automount semantics, is to make fts.c
>> incur, _for every directory traversed_, the cost of an additional
>> stat (fstatat, actually) call just in case this happens to be one of
>> those rare mount points.
>>
>> I would really rather not pessimize most[*] hierarchy-traversing
>> command-line tools by up to 17% (though usually far less) in order
>> to accommodate device-number change semantics that arise
>> for an automountable directory.
>
> I must be missing something. How do you come up with the 17% penalty
> figure? And what does this actually mean in real life?
Actually, it can approach 25%. See below.
> stat() in Linux is fast. Really fast.
Sure. But so are chown, rm, du, etc.
And an extra stat per directory can make a more-than-measurable difference.
On an absolute scale, the difference is not prohibitive,
but wouldn't it be a shame to penalize everyone
for a feature that some of us don't ever use?
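For concreteness, the per-directory check being discussed amounts to one extra fstatat-style call whose st_dev is compared with the parent's. The following is only a minimal Python sketch of that idea, not the actual fts.c change; `crosses_device` is a hypothetical helper name, and `os.stat(..., dir_fd=...)` stands in for fstatat.

```python
import os

def crosses_device(parent_fd: int, name: str) -> bool:
    """Return True if `name`, looked up relative to parent_fd, lives on a
    different device than its parent directory (i.e. looks like a mount point)."""
    parent_dev = os.fstat(parent_fd).st_dev
    # This is the extra call under discussion: one stat per directory entry
    # that is about to be traversed, issued relative to the open parent.
    child_dev = os.stat(name, dir_fd=parent_fd, follow_symlinks=False).st_dev
    return child_dev != parent_dev
```

A traversal would call this once for each directory before descending into it, which is where the added per-directory cost comes from.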
> A quick benchmark clocks
> stat() on my system at 0.814 *microseconds* in the warm cache case,
> and if you're stating a directory that you've traversed, odds are
> extremely high that it will still be in the cache.
>
> My entire laptop root filesystem has 53,934 directories, so an extra
> stat() per directory translates to an extra 43 milliseconds, assuming
> I needed to walk my entire root filesystem. It's really hard to see
> why kernel developers should get worked up into a lather over that
> kind of "performance penalty".
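The estimate quoted above can be checked with a few lines of arithmetic. This sketch uses only the two figures given there (0.814 microseconds per stat, 53,934 directories); the function and constant names are illustrative.

```python
STAT_COST_US = 0.814   # warm-cache stat() cost quoted above, in microseconds
DIRECTORIES = 53_934   # directories on the quoted root filesystem

def extra_stat_cost_ms(directories: int, stat_cost_us: float) -> float:
    """Total cost in milliseconds of one extra stat() per directory."""
    return directories * stat_cost_us / 1000.0

# prints "43.9 ms"
print(f"{extra_stat_cost_ms(DIRECTORIES, STAT_COST_US):.1f} ms")
```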
Here's a comparison with fewer than 5000 directories:
Given a directory named z, with 70 subdirs, each containing 70 empty subdirs.
All names are in 0..69. Hot cache. On a tmpfs file system.
linux 2.6.31.1-56.fc12.x86_64
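The commands used to create the test tree are not shown. One way to build an equivalent hierarchy is sketched below; `build_tree` is a hypothetical helper, and the caller is assumed to run it inside a directory that lives on tmpfs.

```python
import os

def build_tree(root: str, fanout: int = 70) -> int:
    """Create root/<i>/<j> for i, j in 0..fanout-1, all empty directories.
    Returns the total number of directories created, including root."""
    os.mkdir(root)
    count = 1
    for i in range(fanout):
        level1 = os.path.join(root, str(i))
        os.mkdir(level1)
        count += 1
        for j in range(fanout):
            os.mkdir(os.path.join(level1, str(j)))
            count += 1
    return count
```

With the default fanout, `build_tree("z")` creates 1 + 70 + 70*70 = 4971 directories, which matches the "fewer than 5000" figure.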
Compare chgrp -R applied to "z", with and without the stat-adding patch:
$ for i in 1 2 3; do for p in prev .; do echo $p; \
env time $p/chgrp -R group2 z; done; done;
prev
0.03user 0.31system 0:00.34elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+235minor)pagefaults 0swaps
.
0.02user 0.39system 0:00.42elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+235minor)pagefaults 0swaps
prev
0.03user 0.30system 0:00.34elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+235minor)pagefaults 0swaps
.
0.04user 0.38system 0:00.43elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+236minor)pagefaults 0swaps
prev
0.03user 0.31system 0:00.34elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+236minor)pagefaults 0swaps
.
0.04user 0.37system 0:00.41elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+235minor)pagefaults 0swaps
That's a 23.5% performance hit in elapsed time: (0.42 - 0.34)/0.34.
Sure, it's not even 1/10th of a second, but remember this is a tiny
hierarchy, and it's not just chgrp, but also find, rm, du, etc. that
are affected. And this is not the sole reason to make a change, but
rather one more reason, in addition to the one that started this thread.
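The 23.5% figure can be re-derived from the elapsed times of all three runs rather than just the first pair. A small sketch, with the timings copied from the output above and `slowdown_pct` as an illustrative name:

```python
from statistics import mean

# Elapsed wall-clock seconds from the three runs of each binary.
PREV_ELAPSED = [0.34, 0.34, 0.34]     # "prev": chgrp without the extra stat
PATCHED_ELAPSED = [0.42, 0.43, 0.41]  # ".": chgrp with the stat-adding patch

def slowdown_pct(before, after):
    """Relative increase of the mean of `after` over the mean of `before`, in percent."""
    b, a = mean(before), mean(after)
    return (a - b) / b * 100.0

# prints "23.5%"
print(f"{slowdown_pct(PREV_ELAPSED, PATCHED_ELAPSED):.1f}%")
```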