Re: make getdents/readdir POSIX compliant wrt mount-point dirent.d_ino



Theodore Tso wrote:
> On Wed, Nov 04, 2009 at 08:29:00PM +0100, Jim Meyering wrote:
>> One way to accommodate the current automount semantics is to make fts.c
>> incur, _for every directory traversed_, the cost of an additional
>> stat (fstatat, actually) call just in case this happens to be one of
>> those rare mount points.
>>
>> I would really rather not pessimize most[*] hierarchy-traversing
>> command-line tools by up to 17% (though usually far less) in order
>> to accommodate device-number change semantics that arise
>> for an automountable directory.
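
For concreteness, here is a rough Python sketch of the pattern under discussion: one extra fstatat-style call per directory, comparing its device number with the parent's. This is not the fts.c code. The name walk_with_device_check is made up for illustration, and os.stat(..., dir_fd=...) stands in for fstatat.

```python
import os
import tempfile


def walk_with_device_check(root):
    """Walk root and return (extra_stat_calls, device_changes).

    For each subdirectory found, issue one extra stat relative to the
    parent's file descriptor (the fstatat pattern) and compare st_dev
    with the parent's. A differing st_dev suggests a mount point.
    """
    extra_stat_calls = 0
    device_changes = []
    stack = [root]
    while stack:
        parent = stack.pop()
        fd = os.open(parent, os.O_RDONLY | os.O_DIRECTORY)
        try:
            parent_dev = os.fstat(fd).st_dev
            with os.scandir(parent) as entries:
                for entry in entries:
                    if not entry.is_dir(follow_symlinks=False):
                        continue
                    # The additional per-directory call whose cost is at issue.
                    st = os.stat(entry.name, dir_fd=fd, follow_symlinks=False)
                    extra_stat_calls += 1
                    if st.st_dev != parent_dev:
                        device_changes.append(entry.path)
                    stack.append(entry.path)
        finally:
            os.close(fd)
    return extra_stat_calls, device_changes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "a", "b"))
        print(walk_with_device_check(tmp))
```

The point of the sketch is only that the number of extra calls grows with the number of directories, whether or not any of them turns out to be a mount point.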

> I must be missing something. How do you come up with the 17% penalty
> figure? And what does this actually mean in real life?

Actually, it can approach 25%. See below.

> stat() in Linux is fast. Really fast.

Sure. But so are chown, rm, du, etc.
And an extra stat can make more than a measurable difference.
On an absolute scale, the difference is not prohibitive,
but wouldn't it be a shame to penalize everyone
for a feature that some of us don't ever use?

> A quick benchmark clocks
> stat() on my system at 0.814 *microseconds* in the warm cache case,
> and if you're stat'ing a directory that you've traversed, odds are
> extremely high that it will still be in the cache.
>
> My entire laptop root filesystem has 53,934 directories, so an extra
> stat() per directory translates to an extra 43 milliseconds, assuming
> I needed to walk my entire root filesystem. It's really hard to see
> why kernel developers should get worked up into a lather over that
> kind of "performance penalty".
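
The arithmetic behind that estimate can be checked with a few lines of Python. The two constants are the figures quoted above. The helper names are made up for illustration, and measure_stat_us is only a rough way to take a comparable measurement locally (it includes Python call overhead, so it will read higher than a C benchmark).

```python
import os
import time

STAT_COST_US = 0.814   # warm-cache stat() cost quoted above, in microseconds
DIRECTORIES = 53_934   # directory count quoted above


def extra_stat_overhead_ms(n_dirs, cost_us):
    """Total cost in milliseconds of one extra stat() per directory."""
    return n_dirs * cost_us / 1000.0


def measure_stat_us(path=".", iterations=100_000):
    """Rough warm-cache timing of os.stat(), in microseconds per call."""
    os.stat(path)  # first call warms the cache
    start = time.perf_counter()
    for _ in range(iterations):
        os.stat(path)
    return (time.perf_counter() - start) / iterations * 1e6


if __name__ == "__main__":
    # 53,934 * 0.814 us = 43,902 us, i.e. about 43.9 ms
    print(f"{extra_stat_overhead_ms(DIRECTORIES, STAT_COST_US):.1f} ms")
    print(f"{measure_stat_us():.3f} us per os.stat() call on this machine")
```

The product comes to about 43.9 ms, in line with the roughly 43 ms quoted.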

Here's a comparison with fewer than 5000 directories.
Given a directory named z, with 70 subdirs, each containing 70 empty subdirs
(4971 directories in all, counting z itself).
All names are in 0..69. Hot cache. On a tmpfs file system.
Kernel: linux 2.6.31.1-56.fc12.x86_64
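
For anyone wanting to reproduce the layout, a hierarchy of this shape can be built with a short Python script. This is a sketch: build_tree and count_dirs are illustrative names, and the script does not place the tree on tmpfs or warm the cache for you.

```python
import os
import tempfile


def build_tree(root, fanout=70):
    """Create root with `fanout` subdirs, each holding `fanout` empty subdirs.

    Directory names are the decimal strings 0..fanout-1.
    """
    for i in range(fanout):
        for j in range(fanout):
            os.makedirs(os.path.join(root, str(i), str(j)))


def count_dirs(root):
    """Count the directories below root (root itself is not counted)."""
    return sum(len(dirnames) for _, dirnames, _ in os.walk(root))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        z = os.path.join(tmp, "z")
        build_tree(z)
        print(count_dirs(z) + 1)  # 70 + 70*70 subdirectories, plus z itself
```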

Compare chgrp -R applied to "z", with and without the stat-adding patch:

$ for i in 1 2 3; do for p in prev .; do echo $p; \
env time $p/chgrp -R group2 z; done; done;
prev
0.03user 0.31system 0:00.34elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+235minor)pagefaults 0swaps
.
0.02user 0.39system 0:00.42elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+235minor)pagefaults 0swaps
prev
0.03user 0.30system 0:00.34elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+235minor)pagefaults 0swaps
.
0.04user 0.38system 0:00.43elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+236minor)pagefaults 0swaps
prev
0.03user 0.31system 0:00.34elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+236minor)pagefaults 0swaps
.
0.04user 0.37system 0:00.41elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+235minor)pagefaults 0swaps

That's a 23.5% performance hit in elapsed time, from 0.34s (prev)
to about 0.42s (.): (0.42 - 0.34) / 0.34.
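
The percentage can be recomputed from the six runs above. The lists below are copied from the time output; slowdown_percent is an illustrative helper, not part of any tool.

```python
from statistics import mean

# Elapsed and system times in seconds, copied from the runs above.
elapsed_prev = [0.34, 0.34, 0.34]
elapsed_new = [0.42, 0.43, 0.41]
system_prev = [0.31, 0.30, 0.31]
system_new = [0.39, 0.38, 0.37]


def slowdown_percent(before, after):
    """Relative increase of mean(after) over mean(before), in percent."""
    base = mean(before)
    return (mean(after) - base) / base * 100.0


if __name__ == "__main__":
    print(f"elapsed: {slowdown_percent(elapsed_prev, elapsed_new):.1f}%")
    print(f"system:  {slowdown_percent(system_prev, system_new):.1f}%")
```

Averaged over the three runs, elapsed time rises by 23.5% and system time by a little under 24%.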

Sure, it's not even 1/10th of a second, but remember this is a tiny
hierarchy, and it's not just chgrp, but also find, rm, du, etc. that
are affected. And this is not the sole reason to make a change, but
rather one more reason, in addition to the one that started this thread.