Re: comparing algorithm perf.



"Thierry" == Thierry MARTIN <thierry-martin@xxxxxxxxxxx> writes:

Thierry> I see two problems with "time (1)" solution:

Thierry> 1. You can only monitor a "full program", and I don't
Thierry> want initialisations stuff to be taken into acount in my
Thierry> comparison. I want to compare two equivalent loops that
Thierry> are part of two distinct programs. Both algos may take
Thierry> different time to initialise, I don't care about it.

My approach is to use times(2) to get the user/system time values at
the spots you want to inspect, and use a logging library
(e.g. log4cxx) to log the value to a log file. A post-processing
script can then read these "timestamps" from the log file and
calculate the time used in various parts of the program. You can even
make it structured. e.g.

Log file:
utime=5: begin program
utime=1234: begin algorithm_1
utime=5678: algorithm_1 checkpoint_A
utime=6678: algorithm_1 checkpoint_B
utime=6778: algorithm_1 checkpoint_B.1
utime=6878: algorithm_1 checkpoint_B.2
utime=6978: algorithm_1 checkpoint_B.1
utime=7078: algorithm_1 checkpoint_B.2
...
utime=17078: begin algorithm_1 subalgorithm_1a
...
utime=27078: end algorithm_1 subalgorithm_1a
...
utime=37078: begin algorithm_1 subalgorithm_1b
...
utime=47078: end algorithm_1 subalgorithm_1b
...
utime=57078: end algorithm_1
...
utime=67078: end program

A simple perl script can read such a log file and show a tree or
spread*** with the times spent on algorithm 1 and the
sub-algorithms. You can also checkpoint loop bodies to keep track of
the time spent on the individual iterations.



Thierry> 2. if you launch time twice with the same program and
Thierry> same data, you get two different results, which lets me
Thierry> think it depends on the load of your machin.

"real" and "sys" times may different. But "user" time is usually very
stable.

But isn't "user" time, i.e. no. of CPU cycles that your algorithm
consumes, that you should be really measuring? If you want to measure
"real" time, then you have to ask yourself: why are you timing your
algorithm on a machine under non-trivial load? What do you want to
compare with? What conclusions are you trying to draw? Are they
valid with your experimental settings? Are you even doing a
_controlled_ experiment? etc. There is no definite answer to these
questions. It depends on what you want to investigate. e.g. if all
you want is to find the O() time complexity, then it's useless to
time(1) the algorithm.


--
Lee Sau Dan 李守敦 ~{@nJX6X~}

E-mail: danlee@xxxxxxxxxxxxxxxxxxxxxxxxxx
Home page: http://www.informatik.uni-freiburg.de/~danlee
.


Quantcast