Re: If not readdir() then what?
- From: Theodore Tso <tytso@xxxxxxx>
- Date: Sat, 7 Apr 2007 16:36:33 -0400
On Sat, Apr 07, 2007 at 09:57:32AM -0700, Ulrich Drepper wrote:
In their closed chambers (well, workshops,
http://lwn.net/Articles/226351/), the filesystem developers complain
about readdir. I fully appreciate the difficulties. But what I fail
to see so far is any proposal for an alternative interface.
The phase to get new functionality included in the next revision of
POSIX is over. But that does not mean we should not try to get some
sensible new implementation in place. There is, for example, the
"High End Computing Extensions Working Group" (the guys who showed up
here with their statlite and readdirplus proposals). This is an
official working group at the OpenGroup which can produce a document
which can be the basis of inclusion in the next revision and become a
OpenGroup specification earlier than that.
So, if anybody has a proposal for better interfaces let's hear them.
"Now" is a very good time to start working on this.
The problem isn't as much readdir() as it is telldir()/seekdir(),
which fundamentally assumes that the directory is a linear file. JFS
for example was forced to implement an entire separate btree whose
only purpose was to record telldir cookies.
With ext3 and htree we return the directories in hash tree sort order.
This sacrifices some performance with readdir/stat workloads unless
the userspace program does a readdir/qsort on inode number/stat
replacement. In addition, there is the risk of hash collusions
because of the pathetically small size of the telldir cookie (32
bits). If this happens, the readdir() guarantees of only returning
each directory entry at most once are compromised. We use a keyed
hash with a per-filesystem superblock secret to prevent someone from
deliberately finding a hash collision and proving that we're not POSIX
compliant in the edge cases. Cheasy, yes, but only loser programs
should be using telldir/seekdir anyway. :-)
So how do we solve this problem? I can think of two solutions:
1) Deprecate telldir/seekdir() altogether. Relatively few progams use
this functionality, and it is highly questionable how useful it is,
anyway. If you use telldir/seekdir and keep the cookie for a long
time, even the POSIX-provided guarantees about files that are created
and deleted between the telldir() and seekdir() points in time makes
its utility highly dubious.
2) If application programs must have telldir/seekdir, than expand the
size of the cookie from 32-bits to a minimum of 128 bits, and
preferably larger --- say 512 bits, to accomodate systems that might
be using 512-bit variant of SHA-2. So something like this?
/* TELLDIR_COOKIE_SIZE must be >= 64 */
#define TELLDIR_COOKIE_SIZE 64
typedef unsigned char ltelldir_cookie_t[TELLDIR_COOKIE_SIZE];
int ltelldir(int fd, ltelldircookie_t *cookie);
int lseekdir(int fd, ltelldircookie_t *cookie);
My personal preference would be #1, though. :-)
- Ted
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
- Follow-Ups:
- Re: If not readdir() then what?
- From: J. Bruce Fields
- Re: If not readdir() then what?
- From: Jan Engelhardt
- Re: If not readdir() then what?
- From: Christoph Hellwig
- Re: If not readdir() then what?
- References:
- If not readdir() then what?
- From: Ulrich Drepper
- If not readdir() then what?
- Prev by Date: Re: init's children list is long and slows reaping children.
- Next by Date: [PATCH] CRIS: Remove code related to pre-2.2 kernel.
- Previous by thread: If not readdir() then what?
- Next by thread: Re: If not readdir() then what?
- Index(es):
Relevant Pages
|