Re: [Perfctr-devel] Re: Enabling RDPMC in user space by default

From: Andi Kleen (ak_at_suse.de)
Date: 11/29/05

    Date:	Tue, 29 Nov 2005 19:38:48 +0100
    To: John Reiser <jreiser@BitWagon.com>
    
    

    On Tue, Nov 29, 2005 at 10:29:47AM -0800, John Reiser wrote:
    > Andi Kleen wrote:
    > > I think it's also a useful convention - RDTSC is becoming more and more
    > > useless and you cannot expect user applications that just want to
    > > measure some cycles to rely on ever-changing, unstable or nonexistent
    > > performance counter APIs.
    >
    > Users are even more unhappy with ever-changing ABIs -- such as the
    > kernel taking away RDTSC.

    Nobody is talking about taking it away. But it's becoming
    more and more useless because there are many situations
    where it does unexpected things.

    (It's not synchronized across CPUs; on modern Intel CPUs it
    always counts at the rate of the fastest P-state even though you
    might be running slower; on other CPUs, when you want to measure
    time, its rate actually changes with P-states; etc.)

    The performance counter has a much clearer definition - it always
    counts the cycles executed by the CPU, and it doesn't even pretend
    to be a usable timer.
    >
    > RDTSC+perfctr [Pettersson] still is the fastest way for user-mode code
    > to count something that is highly correlated with both "billable"
    > CPU time and "code quality" for a fixed task. With a little care

    Actually that's wrong - at least on Intel CPUs, RDPMC is faster
    than RDTSC because it doesn't synchronize.
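
For readers unfamiliar with the instruction, here is a minimal sketch of what reading `rdpmc 0` from user space might look like in C with GCC/Clang inline assembly. The names `read_pmc` and `pmc_delta`, and the 40-bit counter width used in the usage note, are illustrative assumptions and do not come from this thread. RDPMC faults in user mode unless the kernel has enabled it (CR4.PCE), and the value only means "cycles" if counter 0 has actually been programmed that way, which is the convention being discussed here.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/*
 * Read performance counter `index` with RDPMC.
 * ECX selects the counter; the result comes back in EDX:EAX.
 * Assumption: the kernel allows user-space RDPMC, otherwise this faults.
 */
static inline uint64_t read_pmc(uint32_t index)
{
    uint32_t lo, hi;
    __asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(index));
    return ((uint64_t)hi << 32) | lo;
}
#endif

/*
 * Difference between two counter readings.
 * Hardware counters are narrower than 64 bits, so the subtraction is
 * done modulo 2^width_bits to survive a single wraparound.
 * width_bits is an assumption the caller must supply for its CPU.
 */
static inline uint64_t pmc_delta(uint64_t start, uint64_t end,
                                 unsigned width_bits)
{
    uint64_t mask = (width_bits >= 64) ? ~0ULL
                                       : ((1ULL << width_bits) - 1);
    return (end - start) & mask;
}
```

A measurement would then look roughly like `start = read_pmc(0); work(); cycles = pmc_delta(start, read_pmc(0), 40);`, where `work()` is a placeholder for the code under test and 40 is the assumed counter width.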

    > RDTSC is close enough to monotonic that I find it very useful.

    You tested on a very limited set of platforms and setups then.
    So far you were either lucky or just didn't notice the problems yet.

    About the only reasonable usage was for custom hacks to measure
    cycles, but with all the ongoing changes in its definition
    I believe these users will be happier with rdpmc 0 once
    it's enabled (and once oprofile and other users have been taught
    to keep their fingers off it).

    People who use it directly for timing, rather than measurement,
    are just wrong and misguided.
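
To make the timing-versus-measurement distinction concrete: the message does not name an alternative for timing, but one common choice on Linux is to ask the OS for a monotonic clock instead of reading a raw cycle counter. The sketch below assumes POSIX `clock_gettime` with `CLOCK_MONOTONIC`; the helper name `monotonic_ns` is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

/*
 * Monotonic time in nanoseconds, taken from the OS clock rather than
 * from a raw cycle counter. Returns 0 if the clock cannot be read
 * (a simplification for this sketch).
 */
static uint64_t monotonic_ns(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

Elapsed time is then just the difference of two `monotonic_ns()` readings, and the kernel, not the application, deals with P-state changes and cross-CPU differences.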

    -Andi

