Perl's motto is "There Is More Than One Way To Do It". Some ways are easier to read, some are faster, and some are just plain incomprehensible. Eventually, I will want to know how long it takes for my Perl poetry to execute, and I will use the Benchmark module to find out. Benchmark.pm comes with the standard Perl distribution and is written completely in Perl, so you should be able to use it right away.

TROUBLE WITH TIME()

Before I start with the Benchmark module, let's think about what I must do to time an event. I need to know when the event started and when it ended. If I know those, I can take the difference between them, which I call the duration of the event. There is actually quite a bit to consider.

If I wanted to time one of my scripts, I could look at my watch when the script started, and again when it finished. If my script takes more than several seconds, I might actually be able to do that. I do not need to rely on my watch, though, since Perl already provides a way to do this with the built-in time() function, which returns the time in seconds since the epoch. I can record the time twice and take the difference of the two:

    my $start_time = time;

    # My code here

    my $end_time = time;

    my $difference = $end_time - $start_time;

    print "My code took $difference seconds\n";

Since time() returns seconds, this method can only record times and differences with a resolution of seconds, which might be too coarse a granularity for the really fast code that I have written. Also, remember that my script is not the only process running. The CPU thinks about other things before it finishes my script, so the stopwatch approach does not tell me how long the CPU actually spent on *my* script. On days when the CPU is more loaded, the time to execute might even be longer (or shorter on less loaded days).

BETTER RESOLUTION WITH TIMES()

The built-in times() function returns a list of the user time, system time, children's user time, and children's system time with a finer granularity than time() (it relies on times(2), so see your system's documentation for details - I have no idea what the non-Unix ports use), and it only records the time spent on my process. I can use the same technique that I used before:

    my @start_times = times;

    # My code here

    my @end_times = times;

    my @differences = map { $end_times[$_] - $start_times[$_] }
        ( 0 .. $#start_times );
    my $difference = join ', ', @differences;

    print "My code took ($difference) seconds\n";

But computers are pretty fast, and that code might run a lot faster than the smallest time that I can measure, even with times(). To get around this I can run my code several times, measure the time for all of those iterations, and then take the average. This makes the situation much more complicated: not only would I need a loop to run the code several times while timing it, but I would also need to figure out how the addition of the loop affected the time.
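To see just how much bookkeeping that involves, here is a minimal hand-rolled sketch of the loop-and-average approach; the snippet being timed and the iteration count are invented for illustration:

    #!/usr/bin/perl
    use strict;

    my $iterations = 100_000;

    # First, time an empty loop so the looping overhead
    # can be subtracted later.
    my @empty_start = times;
    for ( 1 .. $iterations ) { }
    my @empty_end = times;
    my $empty_cpu = ( $empty_end[0] - $empty_start[0] )
                  + ( $empty_end[1] - $empty_start[1] );

    # Now time the same loop with the interesting code inside it.
    my @start = times;
    for ( 1 .. $iterations ) {
        my $sum = 0;
        $sum += $_ foreach 1 .. 100;    # the snippet being timed
    }
    my @end = times;
    my $cpu = ( $end[0] - $start[0] ) + ( $end[1] - $start[1] );

    # Subtract the loop overhead and average over the iterations.
    printf "Average CPU time: %.6f seconds per iteration\n",
        ( $cpu - $empty_cpu ) / $iterations;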
As I promised before, you won't have to worry about any of this if you use the Benchmark module.

THE BENCHMARK MODULE

Now I want to rewrite my previous examples using the Benchmark module. To record a time, I construct a Benchmark object. The constructor saves a list of the times returned by time() and times(), although I do not need to worry about that since I just want to use the abstract interface.

    use Benchmark;

    my $start_time = new Benchmark;

    # my really slow code here
    my @array = ( 1 .. 1000000 );
    foreach my $element ( @array ) { $_ += $element }

    my $end_time = new Benchmark;

I also need a way to determine the time difference, which I can do with the timediff() function:

    my $difference = timediff($end_time, $start_time);

which returns another Benchmark object. When I want to see the times that I have measured, I use the timestr() function:

    print "It took ", timestr($difference), "\n";

This function provides several ways to print the time through its optional FORMAT and STYLE parameters, which are explained in the manpage. After running this code I get:

    It took 40 secs (17.53 usr 6.39 sys = 23.92 cpu)

The first number, 40 secs, is the real time it took to execute. The module takes this directly from time(). The next numbers are the values from times(), giving the user and system times, which, when summed, give the total CPU time.

I can also measure the time it takes to do several iterations of the code by using the timeit() function, which takes either a CODE reference (a reference to an anonymous subroutine) or a string to eval(). The function returns a Benchmark object that I can print as before.

    #!/usr/bin/perl
    use Benchmark;

    my $iterations = 1000;

    my $code = <<'CODE';
    foreach my $element ( 1 .. 1000 ) { $_ += $element }
    CODE

    my $time = timeit($iterations, $code);

    print "It took ", timestr($time), "\n";

    __END__
    It took 14 secs ( 8.49 usr 0.00 sys = 8.49 cpu)

CAUTION: Don't compare benchmarks of CODE references and strings! Use the same form for every technique that you compare, since the two forms do not carry the same overhead. This is true for timeit() and any of the other Benchmark functions I will mention.

As I mentioned before, running a snippet of code several times can have additional overhead unassociated with the actual bits of code that you care about - the looping constructs, for instance. One of the advantages of the Benchmark module is that timeit() will run an empty string for the same number of iterations and subtract that time from the time to run your code. This should take care of any extra overhead introduced by the benchmarking code. There are several methods in the module that give you finer control over this feature, and each is documented in the manpage.

The function timethis() is similar to timeit(), but it has optional parameters for TITLE and STYLE. The TITLE parameter allows you to give your snippet a name, and STYLE affects the format of the output. The results are automatically printed to STDOUT, although timethis() still returns a Benchmark object. Internally, timethis() uses timeit().
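For instance, a quick sketch along these lines (the snippet, iteration count, and title are all made up) shows the TITLE parameter at work:

    #!/usr/bin/perl
    use Benchmark;

    # Run the snippet 50000 times; 'Sum to 100' is the TITLE
    # that labels the report printed to STDOUT.
    my $result = timethis( 50_000,
        sub { my $sum = 0; $sum += $_ foreach 1 .. 100 },
        'Sum to 100',
    );

    # timethis() has already printed its report, but it also
    # returns a Benchmark object that I can format myself.
    print "Again: ", timestr($result), "\n";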
EXAMPLE: SOME SUMS

Now that I know how long it took to run my bit of code, I am curious whether I can make that time shorter. Can I come up with another way to do the same task, and, if I can, how does its time compare to the other ways? Using the Benchmark module, I could call timeit() for each bit of code. However, the module anticipates this curiosity and provides a function to compare several snippets of code. The timethese() function is a wrapper around timethis(). The hash %SNIPPETS contains snippet names as keys and either CODE references or strings as values, and the function returns a list of Benchmark objects, one for each snippet:

    my @benchmarks = timethese($ITERATIONS, \%SNIPPETS);

There is no guarantee of the order in which timethese() will time the snippets, so I have to keep track of the order in which timethis() reports the results. If I wanted to do further programmatic calculations with the times, I could store the list returned by timethese(), but for now I will simply rely on the information printed by timethis().

But first, I need something to compare. To demonstrate timethese(), I want to compare five methods of summing an array. I give each snippet a name based on my impression of it. The Idiomatic method is the standard use of foreach(). Even if this method is slower than another, I would use it anyway because it is the easiest to read. The Evil method is a "clever" use of map() in a void context, the Builtin method uses the sum() function from the builtin module, and the Curious and Silly methods press regular expressions into service just to see what that costs.

    #!/usr/bin/perl
    use Benchmark;
    use builtin;    # from CPAN; provides sum()

    my $iterations = 10000;

    # $sum and @array are globals so that the eval()ed
    # snippet strings can see them
    @array = ( 1 .. 1000 );

    my %snippets = (
        Idiomatic => '$sum=0; foreach ( @array ) { $sum += $_ }',
        Evil      => '$sum=0; map { $sum += $_ } @array',
        Builtin   => '$sum = sum @array',
        Curious   => '$sum=0; grep { /^(\d+)$/ and $sum += $1 } @array',
        Silly     => q|$sum=0; $_ = join 'just another new york perl hacker', @array;
            while( m/(\d+)/g ) { $sum += $1 }|,
    );

    timethese($iterations, \%snippets);

    __END__
    Idiomatic:  1 secs ( 1.08 usr 0.00 sys = 1.08 cpu)
    Evil:       2 secs ( 1.44 usr 0.00 sys = 1.44 cpu)
    Builtin:    1 secs ( 0.24 usr 0.00 sys = 0.24 cpu)
                (warning: too few iterations for a reliable count)
    Curious:    8 secs ( 8.17 usr 0.00 sys = 8.17 cpu)
    Silly:     11 secs ( 7.26 usr 0.01 sys = 7.27 cpu)

As you can see, the sum() function from builtin is *very* fast. In fact, it is so fast that for 10000 iterations the Benchmark module couldn't measure a reliable time. The Idiomatic method is slightly faster than the "clever" use of map(), but both are significantly slower than sum(), and the other methods, which I never expected to be fast, are indeed quite slow.

This comparison does not satisfy me, though. What happens as the size of the array and the number of iterations change? Over the past several weeks I ran several combinations of array size and iteration count for each of the methods. Since I don't really care about the Curious and Silly methods, I only report the results for the Idiomatic, Evil, and Builtin methods. I ran each with arrays of 10 to 10000 elements and with 1000 to 1000000 iterations. The longest run took about 86000 CPU seconds on a Sparc20 (96 Mb RAM, Perl 5.004) - that's about one CPU day! Don't try this without telling the system administrator what you are doing, especially if you called your script "test" - it's not nice to get email nastygrams from an administrator who thinks you have a script running amok when it's really doing exactly what you want it to do. Not that this ever happened to me, and you can't prove it anyway.

[Listing 6 - table comparing times for foreach vs. map vs. builtin with different numbers of iterations and varying array sizes]

SO WHAT HAVE I LEARNED?

The "stopwatch" approach is not very effective for timing a piece of code because the CPU can work on other things before it finishes the one I want to time. The Perl built-in times() is a little better, but my code might run faster than times() can measure. I don't have to worry about these issues when I use the Benchmark module, which can average the time to run bits of code over many iterations.

In my summing example, I discovered that the "clever" use of map() was consistently slower than the idiomatic foreach(), which in turn was much slower than sum(). Although foreach() would be the idiomatic way to sum an array, I am hard-pressed to explain why absorbing an order-of-magnitude speed penalty is good - I will be using sum(). If you think that you have a faster method, you now have the tools to test it. If your method beats sum(), send me a note (and Graham too)!

REFERENCES

builtin module, available from CPAN

Benchmark module, included with the standard distribution; perldoc Benchmark

--

Smith Renaud, Inc. makes brian optimize his code. While benchmarking, he reads mail sent to .