Re: measuring clock cycles per second
- From: Jacek Dziedzic <jacek.dziedzic__no--spam__@xxxxxxxxx>
- Date: Mon, 01 Dec 2008 19:59:17 +0100
Rainer Weikusat wrote:
You already asserted this in your last posting. But this amounts to
'it works because I say it does'.
No, not really. I could've backed this up with statistical reasoning, Bernoulli trials come to mind, but thought that would be overkill. In fact, I am not trying to convince you, I'm trying to find a reason why you would think it's useless.
Trying a little thought experiment:
Let's assume that a program only executes a single loop and this loop
invokes two subroutines. The average execution time of
subroutine #1 is 1/4 of the sampling interval, the average execution
time of #2 3/4. This means the program spends 1/4 of its time in
subroutine one, yet if the profiling timer was started 'close' to the
start of the loop, the instruction pointer should basically always be
somewhere in #2 when the value is recorded.
What's wrong with this example?
It's the assumption, which it seems to me you made above, that what happens "on average" happens "in every instance".
Even if the average execution time of #1 is exactly 1/4 of the sampling interval and the average execution time of #2 is exactly 3/4 of the sampling interval, then, provided the standard deviation of the respective probability distributions of times is nonzero, the total time taken by the loop will, in general, be unequal to the sampling interval, even if by a mere 1%. Accumulation of those little differences will, as time progresses, quickly rid #2 of its preference for gathering sampling ticks.
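The drift argument can be tried out with a small simulation. This is only a sketch under stated assumptions, not a model of gprof itself: the sampling interval is normalised to 1.0, the two subroutine times are drawn from Gaussians with means 0.25 and 0.75, and the name `sample_fraction_in_sub1`, the `phase` of 0.9 and the 5% jitter are all made up for illustration.

```python
import random

def sample_fraction_in_sub1(rel_jitter, n_ticks=200000, phase=0.9, seed=1):
    """Fraction of sampling ticks that land in subroutine #1.

    Hypothetical model: sampling interval = 1.0, #1 averages 0.25 and
    #2 averages 0.75 of it, each with relative std deviation rel_jitter.
    The first tick fires `phase` after the loop starts, i.e. inside #2.
    """
    rng = random.Random(seed)
    interval = 1.0
    mean1, mean2 = 0.25, 0.75
    t = 0.0              # start time of the current loop iteration
    next_tick = phase    # time of the next sampling tick
    ticks = 0
    hits1 = 0
    while ticks < n_ticks:
        d1 = max(0.0, rng.gauss(mean1, rel_jitter * mean1))
        d2 = max(0.0, rng.gauss(mean2, rel_jitter * mean2))
        end1 = t + d1        # subroutine #1 returns here
        end2 = end1 + d2     # subroutine #2 returns here
        # Attribute every tick that falls inside this iteration.
        while ticks < n_ticks and next_tick < end2:
            if next_tick < end1:
                hits1 += 1
            ticks += 1
            next_tick += interval
        t = end2
    return hits1 / n_ticks

if __name__ == "__main__":
    for jitter in (0.0, 0.05):
        frac = sample_fraction_in_sub1(jitter)
        print(f"relative jitter {jitter:.2f}: fraction of ticks in #1 = {frac:.3f}")
```

With zero jitter every tick lands in #2, which is the situation described in the thought experiment. With nonzero jitter the tick position wanders through the loop and the fraction attributed to #1 should approach its true share of 1/4. Smaller jitter needs more ticks before the initial phase is forgotten.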
You also seem to conveniently
a) forget that there are other schedulable entities which can, apparently stochastically, cause the program in question to be preempted anywhere inside the loop. Once it runs again, its position relative to the sampling timer will have shifted drastically, because of things like stalls, cache invalidation, page faults, ...
b) assume the extraordinary view that _all_ loops being timed somehow take exact multiples of the sampling interval (which, IMHO, follows from the adjective "useless" that you used). Or was this example only meant to demonstrate something else?
So I tend towards "realizing the limits of profiling with a small
resolution", yet the claim that the technique is "useless except on
ancient hardware" eludes me.
The assumption that gprof actually provided useful output at some
point in the past is just a complimentary assumption of mine, because
I have never seen this happen ever since I first encountered the
program on a 25 MHz processor.
My view is rather that "gprof actually provides useful output today, even with 100/s resolution". I see it happen once in a while when I look for bottlenecks in code that runs on a 2.4 GHz processor. With the gprof output in hand, I can isolate the function in which they occur pretty easily: it's the one at the top of the list, with >90% of the time taken. You optimize it and bang, the running time is slashed. That's the opposite of useless, and it happens every now and then. Rather than the poor resolution, I find the fact that the measurement itself changes the timing (via stalls, interference with cache, etc.) more difficult to work with, especially when -g is used (not sure about the reason).
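For a rough idea of how much a 100/s sampler can resolve, here is a back-of-the-envelope sketch of the Bernoulli-trial reasoning. It assumes each tick is an independent trial that hits a given function with probability p, which is an idealisation; the function name `fraction_std_error` and the 10-second run length are hypothetical.

```python
import math

def fraction_std_error(p, n_samples):
    """Standard error of a sampled time fraction, treating each tick as an
    independent Bernoulli trial with success probability p (an assumption)."""
    return math.sqrt(p * (1.0 - p) / n_samples)

if __name__ == "__main__":
    rate_hz = 100       # sampling rate from the discussion
    run_seconds = 10    # hypothetical run length
    n = rate_hz * run_seconds
    for p in (0.9, 0.25, 0.01):
        err = fraction_std_error(p, n)
        print(f"p={p:.2f}  samples={n}  std error={err:.4f}")
```

Under these assumptions the uncertainty shrinks with the square root of the number of ticks, so a function that dominates the run time stands out after a few seconds of sampling, while very small fractions remain noisy.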
- J.