python vs perl performance test

igor.tatarinov

First, let me admit that the test is pretty dumb (someone else
suggested it :) but since I am new to Python, I am using it to learn
how to write efficient code.

my $sum = 0;
foreach (1..100000) {
    my $str = chr(rand(128)) x 1024;
    foreach (1..100) {
        my $substr = substr($str, rand(900), rand(100));
        $sum += ord($substr);
    }
}
print "Sum is $sum\n";

Basically, the script creates random strings, then extracts random
substrings, and adds up the first characters in each substring. This
perl script takes 8.3 secs on my box and it can probably be improved.

When I first translated it to Python verbatim, the Python script took
almost 30 secs to run.
So far, the best I can do is 11.2 secs using this:

from random import randrange
from itertools import imap, repeat
from operator import getitem, add, getslice

result = 0
zeros = [0]*100
for i in xrange (100000):
    s = [chr(randrange(128))] * 1024
    starts = repeat(randrange(900), 100)
    ends = imap(add, starts, repeat(randrange(1,100), 100))
    substrs = imap(getslice, repeat(s, 100), starts, ends)
    result += sum(imap(ord, imap(getitem, substrs, zeros)))

print "Sum is ", result

There's got to be a simpler and more efficient way to do this.
Can you help?

Thanks,
igor
 
Chris Mellon

Benchmarking is usually done to test the speed of an operation. What
are you trying to measure with this test? String slicing? Numerical
operations? Looping? You're doing all sorts of bogus work for no
reason. The use of randomness is also totally arbitrary and doesn't do
anything except make it harder to confirm the correctness of the test.
Is the ability to generate 0-length substrings (which perl claims have
an ordinal of 0) intentional?

For the record, taking the randomness out makes this take 4 seconds
for perl, and 6 seconds for a literal translation in python. Moving it
into a function drops the python function to 3.5 seconds.
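
A minimal sketch of what "taking the randomness out" and "moving it into a function" might look like, written in Python 3 syntax (the thread itself uses Python 2). The fixed character, offset, and length are assumptions chosen for illustration, not the values behind the timings quoted above. In CPython, names inside a function are local variables, which are looked up faster than module-level globals, and that is the usual explanation for this kind of speedup.

```python
def fixed_sum(outer=100000, inner=100):
    """Deterministic variant of the loop: no random numbers involved."""
    total = 0
    for _ in range(outer):
        s = "a" * 1024            # assumed fixed character instead of a random one
        for _ in range(inner):
            sub = s[450:500]      # assumed fixed offset and length
            total += ord(sub[0])  # first character of the substring
    return total


if __name__ == "__main__":
    # A smaller outer count so the sketch finishes quickly.
    print("Sum is", fixed_sum(outer=1000))
```

Wrapping the call with time.perf_counter() or the timeit module would give timings comparable in spirit to the ones above, though the numbers depend on the machine and the Python version.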
 
Arnaud Delobelle

repeat(randrange(n), p) doesn't do what you want it to.
It generates a single random number less than n and repeats
it p times, for example:
[54, 54, 54, 54, 54, 54, 54, 54, 54, 54]

Instead you want imap(randrange, repeat(n, p)), which gives something like:
[69, 68, 81, 26, 60, 76, 40, 55, 76, 75]
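
The difference can be checked directly. This is a sketch in Python 3 syntax, where the built-in map is lazy and plays the role of Python 2's itertools.imap; the values n = 100 and p = 10 are assumptions for illustration.

```python
from itertools import repeat
from random import randrange

n, p = 100, 10  # assumed values, for illustration only

# randrange(n) is evaluated once, before repeat() ever sees it,
# so the same number comes out p times.
same = list(repeat(randrange(n), p))

# Here randrange is called once per element, with n supplied p times.
fresh = list(map(randrange, repeat(n, p)))

print(same)
print(fresh)
```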

I don't understand why you make the Python version so complicated.
 
Jakub Stolarski

If all you need is the result here's simpler and more efficient code:

from random import randrange
sum = 100 * randrange(128)
print "Sum is ", sum

And the same in perl:

my $sum = 100 * int(rand(128));
print "Sum is $sum\n";

If you really want to compare performance, then look at the Computer
Language Benchmarks Game:
http://shootout.alioth.debian.org/gp4/benchmark.php?test=all&lang=python&lang2=perl
 
igor.tatarinov

Sorry, this wasn't meant to be a benchmark of any kind and I don't
care about the results of the program. It was just an exercise to
figure out why the verbatim translation was so slow. It looks like
Python's randrange() is very slow. random() is much faster but then I
have to use int(N*random()) to get an integer, which is pretty slow
too.

Initially, I thought the difference was due to Python's slicing
implementation, but that doesn't seem to be the case: with fixed
arguments, Perl's substr() and Python's slicing seem to have identical
performance.
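
One way to check this observation is to time the two approaches with the standard timeit module. The sketch below uses Python 3 syntax; N = 128 and the iteration count are assumptions, and the absolute timings will differ by machine and Python version.

```python
import timeit
from random import random, randrange

N = 128  # same upper bound as in the original script


def with_randrange():
    return randrange(N)


def with_random():
    # random() returns a float in [0.0, 1.0), so this yields an int in [0, N).
    return int(N * random())


if __name__ == "__main__":
    for func in (with_randrange, with_random):
        seconds = timeit.timeit(func, number=200000)
        print("%-15s %.3f s" % (func.__name__, seconds))
```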
 
Scott David Daniels

... When I first translated it to Python verbatim,
the Python script took almost 30 secs to run.
So far, the best I can do is 11.2 secs using this:

from random import randrange
from itertools import imap, repeat
from operator import getitem, add, getslice

result = 0
zeros = [0]*100
for i in xrange (100000):
    s = [chr(randrange(128))] * 1024

This doesn't do what you think it does, I'll wager.
Try:
    s = chr(randrange(128)) * 1024
to get an equivalent result.
Or try:
    s = ''.join([chr(randrange(128)) for i in range(1024)])

-Scott
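
To make the difference between these three forms concrete, here is a small sketch in Python 3 syntax. The fixed character "x" stands in for chr(randrange(128)); that substitution is an assumption made so the first two results are reproducible.

```python
from random import randrange

c = "x"  # stands in for chr(randrange(128))

as_list = [c] * 1024   # a list holding 1024 references to one 1-character string
as_str = c * 1024      # a single 1024-character string, like Perl's `x` operator

# Slicing the list gives a list; slicing the string gives a string.
print(type(as_list[10:20]).__name__)   # list
print(type(as_str[10:20]).__name__)    # str

# 1024 independently chosen characters instead of one character repeated.
mixed = "".join(chr(randrange(128)) for _ in range(1024))
print(len(as_list), len(as_str), len(mixed))   # 1024 1024 1024
```

The list form still works with slicing and indexing, which is why the original script ran, but each slice builds a new list rather than a substring.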
 
