[PATCH] fix f_version optimization for get_tgid_list

From: Manfred Spraul (manfred_at_colorfullife.com)
Date: 08/31/04

  • Next message: Pavel Machek: "Re: silent semantic changes with reiser4"
    Date:	Tue, 31 Aug 2004 22:26:35 +0200
    To: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
    
    
    

    Hi,

    the kernel contains an optimization that skips the linked list walk in
    get_tgid_list for the common case of sequential accesses.
    Unfortunately the optimization is buggy (missing NULL pointer check for
    the result of find_task_by_pid) and broken (actually - broken twice: the
    tgid value that is stored in f_version is always 0 because tgid is
    overwritten when the string is created and additionally the common case
    is not filldir < 0, it's running out of nr_tgids).

    The attached patch fixes these bugs.

    Roger: could you run your 100k processes test against a kernel with this
    fix applied? I'm just interested how much it helps. One obvious result
    are fewer getdents64 syscalls: instead of returning 20 entries (480
    bytes) the new code fills the user space buffer completely (42
    entries/syscall).

    Andrew, could you add the patch to your -mm tree?

    Signed-Off-By: Manfred Spraul <manfred@colorfullife.com>

    
    

    --- 2.6/fs/proc/base.c 2004-08-20 19:59:19.000000000 +0200
    +++ build-2.6/fs/proc/base.c 2004-08-31 21:42:28.000000000 +0200
    @@ -1689,7 +1689,7 @@ static int get_tgid_list(int index, unsi
             p = NULL;
             if (version) {
                     p = find_task_by_pid(version);
    - if (!thread_group_leader(p))
    + if (p && !thread_group_leader(p))
                             p = NULL;
             }
     
    @@ -1752,6 +1752,7 @@ int proc_pid_readdir(struct file * filp,
             char buf[PROC_NUMBUF];
             unsigned int nr = filp->f_pos - FIRST_PROCESS_ENTRY;
             unsigned int nr_tgids, i;
    + int next_tgid;
     
             if (!nr) {
                     ino_t ino = fake_ino(0,PROC_TGID_INO);
    @@ -1761,26 +1762,45 @@ int proc_pid_readdir(struct file * filp,
                     nr++;
             }
     
    - /*
    - * f_version caches the last tgid which was returned from readdir
    + /* f_version caches the tgid value that the last readdir call couldn't
    + * return. lseek aka telldir automagically resets f_version to 0.
              */
    - nr_tgids = get_tgid_list(nr, filp->f_version, tgid_array);
    -
    - for (i = 0; i < nr_tgids; i++) {
    - int tgid = tgid_array[i];
    - ino_t ino = fake_ino(tgid,PROC_TGID_INO);
    - unsigned long j = PROC_NUMBUF;
    -
    - do
    - buf[--j] = '0' + (tgid % 10);
    - while ((tgid /= 10) != 0);
    -
    - if (filldir(dirent, buf+j, PROC_NUMBUF-j, filp->f_pos, ino, DT_DIR) < 0) {
    - filp->f_version = tgid;
    + next_tgid = filp->f_version;
    + filp->f_version = 0;
    + for (;;) {
    + nr_tgids = get_tgid_list(nr, next_tgid, tgid_array);
    + if (!nr_tgids) {
    + /* no more entries ! */
                             break;
                     }
    - filp->f_pos++;
    + next_tgid = 0;
    +
    + /* do not use the last found pid, reserve it for next_tgid */
    + if (nr_tgids == PROC_MAXPIDS) {
    + nr_tgids--;
    + next_tgid = tgid_array[nr_tgids];
    + }
    +
    + for (i=0;i<nr_tgids;i++) {
    + int tgid = tgid_array[i];
    + ino_t ino = fake_ino(tgid,PROC_TGID_INO);
    + unsigned long j = PROC_NUMBUF;
    +
    + do
    + buf[--j] = '0' + (tgid % 10);
    + while ((tgid /= 10) != 0);
    +
    + if (filldir(dirent, buf+j, PROC_NUMBUF-j, filp->f_pos, ino, DT_DIR) < 0) {
    + /* returning this tgid failed, save it as the first
    + * pid for the next readir call */
    + filp->f_version = tgid_array[i];
    + goto out;
    + }
    + filp->f_pos++;
    + nr++;
    + }
             }
    +out:
             return 0;
     }
     

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Pavel Machek: "Re: silent semantic changes with reiser4"

    Relevant Pages

    • Re: optimization of the system by recompilation
      ... clean system with GENERIC kernel and binary-installed world, ... this specific task runs slower as result of the optimization. ... The not-optimized Athlon-based system: ... available timing options: USE_TOD HZ=128 ...
      (freebsd-performance)
    • Re: 2.7 thoughts: common well-architected object model
      ... Coming from the kernel an outline of a common ... A core component model has to be defined by the kernel ... Linux to scale as well as it does. ...
      (Linux-Kernel)
    • Re: -mm merge plans for 2.6.23
      ... allnoconfig kernel, a allmodconfig kernel, a allyesconfig kernel and ... With swap prefetch app response time is not laggy when I ... this is a scenario that is common to me and one where swap ...
      (Linux-Kernel)
    • Re: Where to find the interrupt vector table?
      ... commonly used peripherals at a common place, ... set up the SDRam over in kernel also, I mean SdRam clock refresh rate, ... configuration files are somewhere else. ...
      (comp.os.linux.embedded)
    • Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3
      ... optimization by Ingo does nothing else but allowing AIO to probe file ... cache - if data there to go with fast path. ... Servers want to never, ever block. ... I/O in parallel so that the kernel can optimize which order to handle ...
      (Linux-Kernel)