Re: "Read my lips: no more merges" - aka Linux 2.6.14-rc1

From: Eric Dumazet (dada1_at_cosmosbay.com)
Date: 09/15/05

    Date:	Thu, 15 Sep 2005 22:41:15 +0200
    To: Benjamin LaHaise <bcrl@kvack.org>
    
    
    

    Benjamin LaHaise wrote:
    > On Tue, Sep 13, 2005 at 09:04:32AM +0200, Eric Dumazet wrote:
    >
    >>I wish a process param could allow open() to take any free fd available,
    >>not the lowest one. One can always use fcntl(fd, F_DUPFD, slot) to move a
    >>fd on a specific high slot and always keep the 64 first fd slots free to
    >>speedup the kernel part at open()/dup()/socket() time.
    >
    >
    > The overhead is easy to avoid by making use of dup2() and close() to keep
    > the lowest file descriptors in the table free, allowing open() and socket()
    > to always return 3 or 4.

    Yes, this is what I described :) Maybe this was not clear.
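
A minimal sketch of the trick discussed above, assuming a threshold of 64 as in the original suggestion. The names move_high(), open_high() and HIGH_BASE are made up for this illustration; only open(), fcntl(F_DUPFD) and close() are real calls.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#define HIGH_BASE 64    /* hypothetical: first slot outside the reserved low range */

/*
 * Move fd to the lowest free slot >= HIGH_BASE and release the low slot.
 * Returns the new descriptor, or the original fd if the move failed.
 */
static int move_high(int fd)
{
        int high;

        assert(fd >= 0);
        high = fcntl(fd, F_DUPFD, HIGH_BASE);
        if (high == -1)
                return fd;      /* keep the original descriptor on failure */
        close(fd);              /* the low slot is free again */
        return high;
}

/* Open path, then park the descriptor above the low range. */
static int open_high(const char *path, int flags)
{
        int fd = open(path, flags);

        if (fd == -1)
                return -1;
        return move_high(fd);
}
```

With every long-lived descriptor parked this way, a later plain open() or socket() finds a free slot near the bottom of the table. Note that F_DUPFD itself still searches upward from HIGH_BASE, which is why the library further down remembers exact free slots instead of always asking for 64.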

    >
    > Alternatively, the kernel could track available file descriptors using a
    > tree to efficiently insert freed slots into an ordered list of free
    > regions (something similar to the avl tree used in vmas). Is it worth
    > doing?

    Well no, since a user app can manage this part itself if it happens to be
    performance critical.
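
For context, a rough userland sketch of the kind of linear bit-table scan whose cost is at issue here. This is not the kernel's code; first_zero_bit() and BITS_PER_WORD are names invented for the illustration, and the only point is that the work grows with the number of descriptors in use below the first free one.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the first clear bit in map[0..nbits), or nbits if
 * every bit is set. A set bit stands for a descriptor that is in use.
 */
static size_t first_zero_bit(const unsigned long *map, size_t nbits)
{
        size_t nwords = (nbits + BITS_PER_WORD - 1) / BITS_PER_WORD;

        assert(map != NULL);
        for (size_t w = 0; w < nwords; w++) {
                if (map[w] == ULONG_MAX)
                        continue;               /* word fully in use, skip it */
                for (size_t b = 0; b < BITS_PER_WORD; b++) {
                        size_t idx = w * BITS_PER_WORD + b;

                        if (idx >= nbits)
                                return nbits;   /* ran past the valid bits */
                        if (!(map[w] & (1UL << b)))
                                return idx;     /* first free slot */
                }
        }
        return nbits;
}
```

If the low slots are kept free, a scan like this stops in the first word; if a million descriptors are packed at the bottom, it walks the whole table.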

    Sample of a userland library: after calling fdcache_init() once at
    startup, each time a new fd is returned by
    open()/socket()/pipe()/accept()..., the thread should call

    fd = fdcache_dupfd(fd);

    and close the file with fdcache_closefd(fd) instead of close(fd).

    Eric

    
    

    /*
     * The Unix kernel has an expensive get_unused_fd() function:
     * Unix semantics mandate that an open()/pipe()/socket() call always
     * returns the lowest free fd, not an arbitrary one.
     * Linux uses a linear scan of a table of bits.
     * A program handling 1,000,000 files scans about 128 KB of RAM with a
     * spinlock held: no other thread can get a fd in the meantime.
     *
     * The trick is to use this library to make sure the 64 low fds stay
     * available, so that the standard Unix functions don't have to scan many
     * fds before finding a free one, and to remap the fds they return with
     * fcntl(F_DUPFD) to precise slots we manage ourselves.
     */
    #include <pthread.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define MAXFDS 1500000

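    /* Free-slot bookkeeping, protected by lock. */
    struct {
            pthread_mutex_t lock;
            unsigned int cache_fd;    /* head of the free-slot list, 0 if empty */
            unsigned int next_alloc;  /* next never-used slot */
            unsigned int *cache_tab;  /* cache_tab[slot]: next free slot after it */
    } fdd;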

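    void fdcache_init(void)
    {
            pthread_mutex_init(&fdd.lock, NULL);
            fdd.cache_tab = calloc(MAXFDS, sizeof(unsigned int));
            if (fdd.cache_tab == NULL)
                    abort();        /* every later call needs the table */
            fdd.next_alloc = 64;    /* slots 0..63 are left free on purpose */
    }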

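    int fdcache_dupfd(int fd)
    {
            int ret;

            pthread_mutex_lock(&fdd.lock);
            /* Free list empty: take a fresh slot above the reserved low range. */
            if (fdd.cache_fd == 0)
                    fdd.cache_fd = fdd.next_alloc++;
            ret = fcntl(fd, F_DUPFD, fdd.cache_fd);
            if (ret != -1 && ret < MAXFDS) {
                    /* Pop the slot: its entry holds the next free slot, or 0. */
                    fdd.cache_fd = fdd.cache_tab[ret];
                    pthread_mutex_unlock(&fdd.lock);
                    close(fd);
                    return ret;
            }
            pthread_mutex_unlock(&fdd.lock);
            /* Remap failed or landed outside the table: keep the original fd. */
            if (ret != -1)
                    close(ret);
            return fd;
    }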

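    void fdcache_closefd(int fd)
    {
            if (fd < 0)
                    return;

            close(fd);

            /* Only remapped slots (>= 64, inside the table) go on the free list. */
            if (fd < 64 || fd >= MAXFDS)
                    return;

            pthread_mutex_lock(&fdd.lock);
            fdd.cache_tab[fd] = fdd.cache_fd;
            fdd.cache_fd = fd;
            pthread_mutex_unlock(&fdd.lock);
    }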
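
The bookkeeping in the library above amounts to a LIFO free list threaded through an array. The following is a standalone sketch of just that structure, with no file descriptors involved; slot_get(), slot_put(), NSLOTS and FIRST_SLOT are hypothetical names, and the table is kept small for the example.

```c
#include <assert.h>

#define NSLOTS     1024 /* hypothetical table size for the sketch */
#define FIRST_SLOT 64   /* slots below this are never handed out */

static unsigned int next_free[NSLOTS];  /* next_free[s]: slot freed before s, or 0 */
static unsigned int top;                /* most recently freed slot, 0 if none */
static unsigned int next_fresh = FIRST_SLOT;

/* Reuse the most recently freed slot, else a fresh one; 0 means exhausted. */
static unsigned int slot_get(void)
{
        unsigned int s;

        if (top == 0) {
                if (next_fresh >= NSLOTS)
                        return 0;
                return next_fresh++;
        }
        s = top;
        top = next_free[s];     /* pop: the previous free slot becomes the head */
        return s;
}

/* Push a slot back; it will be the next one handed out. */
static void slot_put(unsigned int s)
{
        assert(s >= FIRST_SLOT && s < NSLOTS);
        next_free[s] = top;
        top = s;
}
```

Because 0 doubles as the "list empty" marker, slot 0 can never be stored in the list, which fits a scheme where the low descriptors are deliberately left alone. Both operations are constant time, in contrast to a scan for the lowest free slot.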


