Re: [profile] amortize atomic hit count increments

From: Andrea Arcangeli (andrea_at_novell.com)
Date: 09/14/04

  • Next message: Pierre Ossman: "Re: HP/Compaq (Winbond) SD/MMC reader supported"
    Date:	Tue, 14 Sep 2004 18:05:31 +0200
    To: William Lee Irwin III <wli@holomorphy.com>
    
    

    On Tue, Sep 14, 2004 at 08:51:03AM -0700, William Lee Irwin III wrote:
    > On Mon, Sep 13, 2004 at 09:47:48PM -0700, William Lee Irwin III wrote:
    > >> timer interrupt, usually at boot. The following patch attempts to
    > >> amortize the atomic operations done on the profile buffer to address
    > >> this stability concern. This patch has nothing to do with performance;
    >
    > On Tue, Sep 14, 2004 at 01:34:19PM +0200, Andrea Arcangeli wrote:
    > > isn't it *much* simpler and much more efficient to just have a per-cpu
    > > idle function? I seriously doubt you'll get simultaneous collisions on
    > > anything but the 'halt' instruction in the idle function.
    >
    > Sampling the profile buffer at regular intervals shows far less than
    > 256 distinct functions hit in 1s intervals even with all cpus busy. As
    > for whether that would be sufficient, that will have to be answered by
    > those who reported the bug. I suppose to test whether things besides
    > idling do cause this problem, one would boot with a restricted
    > prof_cpu_mask, load all cpus on the machine, set the prof_cpu_mask to
    > unrestricted, and see if it livelocks before the load terminates.

    It probably worth to measure it. The real bottleneck happens when all
    cpus tries to get an exclusive lock on the same cacheline at the *same*
    time. 1 second is a pretty long time, if there's no contention of the
    cacheline, things are normally ok.

    this is basically the same issue we had with RCU since all timers fired
    at the same wall clock time, and all of them tried to change bits in the
    same cacheline at the same time, that is a workload that collapse a
    512-way machine ;). The profile timer is no different.

    Simply removing the idle time accounting would fix it, however this
    cripple down functionality a little bit, but it'll be a very good way to
    test if my theory is correct, or if you truly need some per-cpu logic in
    the profiler.

    You could also fake it, have a per-cpu counter only for the current->pid
    case, and then once somebody reads /proc/profile, you flush the total
    per-cpu count to the counter in the buffer that corresponds to the EIP
    of the idle func.

    Before dedicidng I'd suggest to have a look and see how the below patch
    compares to your approch in performance terms.

    --- sles/arch/ia64/kernel/time.c.~1~ 2004-08-25 02:47:33.000000000 +0200
    +++ sles/arch/ia64/kernel/time.c 2004-09-14 18:01:39.792182008 +0200
    @@ -206,6 +206,9 @@ ia64_do_profile (struct pt_regs * regs)
             if (!prof_buffer)
                     return;
     
    + if (!current->pid)
    + return;
    +
             ip = instruction_pointer(regs);
             /* Conserve space in histogram by encoding slot bits in address
              * bits 2 and 3 rather than bits 0 and 1.
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Pierre Ossman: "Re: HP/Compaq (Winbond) SD/MMC reader supported"

    Relevant Pages

    • [patch 05/10] Move profiling to instrumentation
      ... Manages a direct-mapped profile hit count buffer, ... * profile hits required for the accuracy of reported profile hits ... * The open-addressed hashtables are indexed by profile buffer slot ... * and hold the number of pending hits to that profile buffer slot on ...
      (Linux-Kernel)
    • cpufreqd: ondemand
      ... is full speed if the system is idle. ... Why does the same cpufreqd configuration behave differently on ... Debian than on Mandriva? ... # [Profile] sample: ...
      (Debian-User)
    • Re: 2.6.5-rc3-mm4
      ... > The profile is interesting: ... > - There's a ton of idle time there. ... It is difficult to tell how idle each one is due to lack of total ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: Unbloating the kernel, was: :mem=16MB laptop testing
      ... you could take profiles, provided you have enough RAM for the profile ... to steal the profile buffer from e820 regions not otherwise used for ... send the line "unsubscribe linux-kernel" in ...
      (Linux-Kernel)
    • Re: Quote of the day (from non-riders)
      ... I'm trying!:o There's still too much chance of me falling off any ... second for me to idle on the street. ... jimothy's Profile: http://www.unicyclist.com/profile/14516 ...
      (rec.sport.unicycling)