Re: question?

From: James Wilkinson (james_at_westexe.demon.co.uk)
Date: 10/01/04

    Date: Fri, 1 Oct 2004 17:33:38 +0100
    To: fedora-list@redhat.com
    
    

    Dan Ladd wrote:
    > Yeah this is the only email address i could find because I had a
    > question about the fedora project. I didn't know where to direct it.
    > So here goes... I was wondering about the WinFS file management
    > system. If something like that (meaning the built in database
    > system) was going to be included in fedora or other RedHat operating
    > systems or if something like it is already in an operating system with
    > RedHat? It seems like it is old technology because the AS/400 or now
    > the iSeries has the built in database system to locate physical files
    > that are stored in logical files. If you could direct me to where i
    > can have that answered that would be awesome.

    Oohh. *BIG* question.

    Background: Unix invented much of the "everything is a file, stored in a
    file tree, and the OS just sees a normal file as a collection of bits"
    philosophy that is now more or less standard. Meanwhile, the relational
    database became the standard way of storing smaller, structured data.

    *Lots* of computer scientists have been wondering whether this split
    between ways of storing data is ideal. So there has been a lot of work
    done looking at the best ways to store general-purpose files in a
    database.

    It turns out that this is a Hard Problem. Storing files as opaque,
    binary objects in a database isn't a problem: a lot of modern
    filesystems effectively do this. The question is whether we can take
    anything else from the database world.

    Here you should understand that there is no agreed vision of how things
    should work. This is the main point I want to make. So you will have to
    work out what you want from a database filesystem, and see what provides
    it.

    The two big problems come under the heading of writing and reading.
    Writing is relatively easy, since you can define the problem: It Would Be
    Nice If Linux allowed multiple updates to one file or to many files to be
    treated as a transaction. Even there, there is the unfortunate detail
    of getting transactions to span filesystems.
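
    To illustrate the single-file case: the usual userspace approximation
    today is to write a new copy and rename it over the old one. This is
    only a sketch, and atomic_write is a name I have made up. It gives
    all-or-nothing replacement of one file, not a transaction across many
    files, and it only works within one filesystem.

```python
import contextlib
import os
import tempfile


def atomic_write(path, data):
    """Replace the contents of `path` with `data` (bytes), all or nothing."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file has to live on the same filesystem as the target,
    # because a rename cannot cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new data to disk first
        # Readers see either the old file or the new one, never a mixture.
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed before the rename: discard the partial copy.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
```

    Anything that needs two files to change together still has to invent
    its own locking or journalling on top of this.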

    Reading is more of a problem. Many file formats keep internal metadata
    (author, image size, artist, etc.), and there is a demand to keep more
    data against files (e.g. Access Control Lists). Many people think that
    there should be a better way of finding all recordings of Vaughan
    Williams' works than finding all MP3s, all OGGs, all Real Media, etc.
    and running format-specific query programs against each file. (One still
    has a problem if someone entered "Old Hundredth, arr. RVW", but that
    can be ignored in the first few versions...) Maybe icons for a file
    should be stored against the file.

    This is the metadata problem, or rather, series of problems. One is,
    simply, how do you present metadata under Unix-like systems? Solaris
    has a special system call and program to access the metadata; Hans
    Reiser is proposing to allow you to access each file as a directory
    with the metadata available underneath (obviously, this isn't practical
    with real directories).

    The other big problem is how much metadata should move around with a
    file. It's obvious that you want to be able to export files in an
    existing format, which will drop any metadata that isn't already in the
    format. (You still need to support existing filesystems, for example on
    CDs).

    Then when you're copying things around, some metadata (user to last
    modify) should change, others (user to first modify) shouldn't. This
    means that something is going to have to know a *lot* about the way that
    metadata works, which means you are going to have a lot of per-filetype
    programming and/or a lot of rigidity.
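
    A rough sketch of what that knowledge might look like, as a policy
    table consulted on every copy. The key names, the three actions and
    the rule that unknown keys get dropped are all assumptions of mine,
    not taken from any real filesystem.

```python
# Hypothetical actions a copy can apply to one piece of metadata.
PRESERVE = "preserve"            # carry the value over unchanged
SET_TO_COPIER = "set_to_copier"  # overwrite with the user doing the copy
DROP = "drop"                    # do not carry the value over at all

# Hypothetical policy table. A real system would need one per file type,
# which is where the per-filetype programming comes in.
COPY_POLICY = {
    "artist": PRESERVE,
    "first_modified_by": PRESERVE,
    "last_modified_by": SET_TO_COPIER,
}


def metadata_for_copy(metadata, copying_user, policy=COPY_POLICY):
    """Return the metadata the new copy of a file should carry."""
    result = {}
    for key, value in metadata.items():
        # Assumption: anything the policy does not know about is dropped.
        action = policy.get(key, DROP)
        if action == PRESERVE:
            result[key] = value
        elif action == SET_TO_COPIER:
            result[key] = copying_user
    return result
```

    Dropping unknown keys is the rigid option; preserving them blindly is
    the other, and neither is obviously right.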

    The main contender to "solve" both of these problems is the Reiser 4
    filesystem, which has a number of problems of its own.

     * It's very new, not fully debugged, and has a number of security and
       reliability problems.

     * It stores metadata for a file by treating the file as a directory, and
       putting metadata as pseudo-files in that directory. That changes the
       way users and programs think about files, and will invalidate a lot
       of assumptions.

    See http://lwn.net/Articles/14035/ .

    Other contenders include user-space plugins to the Gnome or KDE virtual
    filesystems. These can be reasonably taught "this is an MP3, this is an
    XML document", and retrieve the meta-data on demand.
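
    To give a flavour of the plugin idea, here is a minimal sketch of a
    registry of per-format extractors. It is not how the Gnome or KDE
    layers are actually written, and every name in it is my own. The MP3
    reader only understands the old 128-byte ID3v1 tag at the end of the
    file, and the XML reader just reports the root element.

```python
import os
import xml.etree.ElementTree as ET

# Maps a lower-case file extension to the function that understands it.
EXTRACTORS = {}


def extractor(*extensions):
    """Decorator that registers a function for the given extensions."""
    def register(func):
        for ext in extensions:
            EXTRACTORS[ext] = func
        return func
    return register


@extractor(".mp3")
def id3v1_metadata(path):
    """Read the 128-byte ID3v1 tag from the end of an MP3, if present."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() < 128:
            return {}
        f.seek(-128, os.SEEK_END)
        tag = f.read(128)
    if tag[:3] != b"TAG":
        return {}

    def text(raw):
        # Fields are fixed width and padded with NULs (or spaces).
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
    }


@extractor(".xml")
def xml_metadata(path):
    """Report the root element name of an XML document."""
    return {"root_element": ET.parse(path).getroot().tag}


def metadata(path):
    """Retrieve metadata on demand; unknown file types yield nothing."""
    func = EXTRACTORS.get(os.path.splitext(path)[1].lower())
    return func(path) if func else {}
```

    Supporting Ogg or Real Media would mean registering one more function
    each, which is exactly the per-format work nobody escapes.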

    It still isn't clear how best to make this visible to non-technical
    end-users. It's largely those people who *aren't* happy with shell
    scripting who would most benefit from easy ways to look for files with
    Vaughan Williams recordings. (Those who can script will probably have
    the sense to put RVW in the pathname somewhere, and can use custom
    tools.)

    On top of this, maybe some files should be word-indexed (though it
    doesn't make sense to do this for Ogg files). Microsoft's Find Fast has
    long done this in userspace with a separate database, and that does seem
    much better than putting the support in the kernel.
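
    A userspace word index with a separate database needs very little
    code. The following is only a sketch of the general idea using SQLite.
    It says nothing about how Find Fast itself works, and the table layout
    and function names are mine.

```python
import re
import sqlite3


def open_index(db_path=":memory:"):
    """Open (or create) the index database, which lives apart from the files."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS words ("
        " word TEXT, path TEXT, PRIMARY KEY (word, path))"
    )
    return db


def index_file(db, path):
    """Record every distinct word in a text file, replacing older entries."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        words = set(re.findall(r"\w+", f.read().lower()))
    with db:  # one database transaction per file
        db.execute("DELETE FROM words WHERE path = ?", (path,))
        db.executemany(
            "INSERT INTO words (word, path) VALUES (?, ?)",
            [(word, path) for word in words],
        )


def search(db, word):
    """Return the paths of all indexed files containing `word`."""
    rows = db.execute(
        "SELECT path FROM words WHERE word = ? ORDER BY path",
        (word.lower(),),
    )
    return [row[0] for row in rows]
```

    The kernel never needs to know the index exists; the price is that
    something has to re-run index_file when a file changes.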

    I don't know much about OS/400: it always sounded as though they
    implemented the database first, and then created the entire OS and
    related applications around the database. They had the advantage that
    everything knew it was going to be working with a database, and
    programs on OS/400 probably really want a database backend anyway.

    Linux doesn't have that, and is a much more general purpose OS.

    I've also come across http://lwn.net/Articles/56923/ : you might want to
    read that, too.

    Note that WinFS itself appears to be delayed until the end of the decade.

    Sorry for the length of this e-mail: there's more I could say, but won't.

    James.

    -- 
    E-mail address: james | Examiner: How does an AC motor start?
    @westexe.demon.co.uk  | Student: vrrrrrrrrrrRrRRRRRRR...
                          | Examiner: Stop! Stop!
                          | Student: RRRRRRRmmmmm.
    
