performance comparison

Boris Glawe · Aug 27, 2004

Hi,

I am implementing one and the same algorithm in different programming languages,
in order to compare their performance. It's a backtracking algorithm, which
takes over 115mio recursions to terminate, where the recursion depth isn't
deeper then 49.

This algorithm searches for a trip with the horse on a chess board, which visits
every fields exactly once. On a 7x7 board starting from field (4,4), the times
of execution on my Athlon XP 2800 are as follows

Fastest is C: 4.1 seconds
then C++: 4.5 seconds

Now I tried ruby: 16 minutes !!

I included the code below. Is this performance difference normal ? We are
talking about the factor 62 !! Maybe I have done something wrong in my
algorithm, but I don't know what. It's very straight and doesn't use any
expensive container operations:

The program consists of 3 Files: Board.rb, Field.rb, springer.rb

Simply execute springer.rb with 3 Parameters: dimension of the board,
startposition_x and startposition_y

The example from above on a unixshell:

../springer 7 4 4

What's the reason for this big performance difference ? Can one increase speed
in any way?? Maybe the ruby compiler has to recompile things again and again in
this recursion and I don't know it?

################ Board.rb
############################################################################

require "Field"

class Board

private

attr :num_visisted_fields, true
attr_reader :fields
attr_reader :num_fields

def initialize(dimension)
@dimension = dimension
@fields = []
@trip = []
@num_fields = dimension*dimension
@num_visited_fields = 0
@recursion_count = 0
create_fields
search_neighbours
end

def create_fields
for i in 0...dimension
line = []
for j in 0...dimension
line.push(Field.new(i,j))
end
@fields.push(line)
end
end

def search_neighbours
fields.each do |line|
line.each do |current_field|
tmp_field = field(current_field.pos_x + 2, current_field.pos_y + 1)
current_field.neighbours.push(tmp_field) unless ! tmp_field

tmp_field = field(current_field.pos_x + 1, current_field.pos_y + 2)
current_field.neighbours.push(tmp_field) unless ! tmp_field

tmp_field = field(current_field.pos_x - 1, current_field.pos_y + 2)
current_field.neighbours.push(tmp_field) unless ! tmp_field

tmp_field = field(current_field.pos_x - 2, current_field.pos_y + 1)
current_field.neighbours.push(tmp_field) unless ! tmp_field

tmp_field = field(current_field.pos_x - 2, current_field.pos_y - 1)
current_field.neighbours.push(tmp_field) unless ! tmp_field

tmp_field = field(current_field.pos_x - 1, current_field.pos_y - 2)
current_field.neighbours.push(tmp_field) unless ! tmp_field

tmp_field = field(current_field.pos_x + 1, current_field.pos_y - 2)
current_field.neighbours.push(tmp_field) unless ! tmp_field

tmp_field = field(current_field.pos_x + 2, current_field.pos_y - 1)
current_field.neighbours.push(tmp_field) unless ! tmp_field
end
end
end

public
attr :dimension, true
attr_reader :trip
attr_reader :recursion_count

def find_trip (current_field)

@recursion_count += 1

current_field.visited = true
@num_visited_fields += 1

if @num_visited_fields >= @num_fields
trip.unshift(current_field)
return
end

for neighbour in current_field.neighbours

next if neighbour.visited

find_trip(neighbour)
if @num_visited_fields >= @num_fields
trip.unshift(current_field)
return
end
end

current_field.visited = false
@num_visited_fields -= 1
end

def field(x, y)
if (x >= @dimension || y >= @dimension || x < 0 || y < 0)
return nil
end
@fields[x][y]
end

def print_board
fields.each do |line|
line.each do |field|
print field.to_s
end
print "\n"
end
end
end

################ Field.rb
############################################################################

class Field

private

def initialize(x,y)
@pos_x = x
@pos_y = y
@visited = false
@neighbours = []
end

public
attr_reader

os_x,

os_y
attr :visited, true
attr :neighbours, true

def to_s
"(" + @pos_x.to_s + "," + @pos_y.to_s + ")"
end

end

################ springer.rb
############################################################################
#!/usr/bin/env ruby

require "Board"

if ARGV.length != 3
print "usage: springer <dimension> <start_x> <start_y>,\n" +
"where both <start_x> and <start_y> must be smaller then <dimension>\n"
Kernel.exit
end

begin
dimension = Integer(ARGV[0])
raise ArgumentError unless dimension > 0

start_x = Integer(ARGV[1])
raise ArgumentError unless start_x < dimension && start_x >= 0

start_y = Integer(ARGV[2])
raise ArgumentError unless start_y < dimension && start_y >= 0

rescue ArgumentError

print "Parameters must be numeric and useful\n"
Kernel.exit

end

board = Board.new(dimension)

start_field = board.field(start_x, start_y)

board.find_trip(start_field)

board.trip.each do |field|
print field.to_s, "\n"
end
print "number of recursions: " + board.recursion_count.to_s + "\n"

############################################################################

Joel VanderWerf · Aug 27, 2004

Boris said:
def field(x, y)
if (x >= @dimension || y >= @dimension || x < 0 || y < 0)
return nil
end
@fields[x][y]
end

One small improvement (not likely to affect speed much):

def field(x, y)
ary = @fields[x]
ary && ary[y]
end

This will still return nil in the same out-of-bounds cases.

Olivier D. · Aug 27, 2004

Fastest is C: 4.1 seconds
then C++: 4.5 seconds

Now I tried ruby: 16 minutes !!

I have 74ms on my P3-450MHz, that must be a mistake in my implementation
but: could you give us the output of './springer.rb 7 4 4' you were
expecting. I have: (4, 4) (6, 5) ... with 44 recursions.

def search_neighbours
fields.each do |line|
line.each do |current_field|
tmp_field = field(current_field.pos_x + 2, current_field.pos_y + 1)
current_field.neighbours.push(tmp_field) unless ! tmp_field
...
end
end
end

'unless !' is called 'if' in Ruby

and I have a small optimisation:

KNIGHT_MOVES = [ [2, 1], [-1, 2], [-2, -1], [-2, 1] ]

def search_neighbours
@fields.each do |line|
line.each do |current_field|
KNIGHT_MOVES.each do |move|
tmp = field( current_field.x + move[0],
current_field.y + move[1] )
current_field.neighbours << tmp if tmp
tmp = field( current_field.x + move[1],
current_field.y + move[0] )
current_field.neighbours << tmp if tmp
end
end
end
end

Boris Glawe · Aug 27, 2004

the method is called during the initialization only. the algorithm itself
doesn't use this, so it won't influence performance in any way. thanks anyway !

greets Boris

Boris Glawe · Aug 27, 2004

Olivier said:
I have 74ms on my P3-450MHz, that must be a mistake in my implementation
but: could you give us the output of './springer.rb 7 4 4' you were
expecting. I have: (4, 4) (6, 5) ... with 44 recursions.

44 recursions is impossible, since it needs at least 49 recursions to finish (a
7x7 board has 49 fields). Here is the output of my C++ version, since I don't
want to wait another 16 minutes. The output of my ruby version was the same though.

(0,0);
(2,1);
(0,2);
(1,0);
(3,1);
(4,3);
(6,2);
(5,0);
(4,2);
(6,3);
(5,1);
(3,0);
(2,2);
(0,3);
(1,1);
(3,2);
(5,3);
(6,1);
(4,0);
(5,2);
(6,0);
(4,1);
(2,0);
(0,1);
(1,3);
(0,5);
(2,6);
(3,4);
(5,5);
(3,6);
(1,5);
(2,3);
(0,4);
(1,6);
(2,4);
(1,2);
(3,3);
(5,4);
(6,6);
(4,5);
(6,4);
(5,6);
(3,5);
(1,4);
(0,6);
(2,5);
(4,6);
(6,5);
(4,4);
number of recursions 115583274

def search_neighbours
fields.each do |line|
line.each do |current_field|
tmp_field = field(current_field.pos_x + 2, current_field.pos_y + 1)
current_field.neighbours.push(tmp_field) unless ! tmp_field
...
end
end
end

Click to expand...

'unless !' is called 'if' in Ruby and I have a small optimisation:

KNIGHT_MOVES = [ [2, 1], [-1, 2], [-2, -1], [-2, 1] ]

def search_neighbours
@fields.each do |line|
line.each do |current_field|
KNIGHT_MOVES.each do |move|
tmp = field( current_field.x + move[0],
current_field.y + move[1] )
current_field.neighbours << tmp if tmp
tmp = field( current_field.x + move[1],
current_field.y + move[0] )
current_field.neighbours << tmp if tmp
end
end
end
end

thanks, this is more elegant. But doesn't help with the performance, since
search_neighbours is called only once, by the board's constructor. thanks anyway
! greets Boris

Florian Gross · Aug 27, 2004

Boris said:
Hi,

Moin!

Let me point out two small optimizations. Maybe they will accumulate.

def find_trip (current_field)

@recursion_count += 1

current_field.visited = true
@num_visited_fields += 1

if @num_visited_fields >= @num_fields
trip.unshift(current_field) @trip.unshift(current_field)
return
end

for neighbour in current_field.neighbours

next if neighbour.visited

find_trip(neighbour)
if @num_visited_fields >= @num_fields
trip.unshift(current_field) @trip.unshift(current_field)
return
end
end

current_field.visited = false
@num_visited_fields -= 1
end

Another thing you might want to note is that Array#unshift is quite a
lot slower than Array#push (100000 iterations of #unshift take ~25
seconds and 100000 iterations of #push only ~0.07 seconds.)

Output however only happens once so you could get away with just
iterating over the Array in reverse there. (via #reverse_each)

Regards,
Florian Gross

George Ogata · Aug 27, 2004

Boris Glawe said:
What's the reason for this big performance difference ?

I don't know what your C code looks like, but when you consider what
must go in a typical C statement vs. an equivalent ruby statement, I
don't find it too surprising. How do other interpreted languages
(e.g., python) go?

As a simpler example, observe a dumb-Fibonacci:

----- sources ----

g@crash:~/tmp$ #### C version ####
g@crash:~/tmp$ cat test.c
#include <stdio.h>

int fib(int n) {
switch (n) {
case 0:
return 0;
case 1:
return 1;
default:
return fib(n-1) + fib(n-2);
}
}

int main(int argc, char **argv) {
int n;
sscanf(argv[1], "%d", &n);
printf("%d\n", fib(n));
return 0;
}
g@crash:~/tmp$ #### ruby version ####
g@crash:~/tmp$ cat test.rb
def fib n
if n > 1
fib(n-1) + fib(n-2)
elsif n.zero?
0
else
1
end
end

puts fib(ARGV[0].to_i)
g@crash:~/tmp$ #### python version ####
g@crash:~/tmp$ cat test.py
import sys

def fib(n):
if n > 1:
return fib(n-1) + fib(n-2)
elif n == 0:
return 0
else:
return 1

print fib(int(sys.argv[1]))

----- runs -----

g@crash:~/tmp$ gcc test.c && /usr/bin/time ./a.out 35
9227465
0.44user 0.00system 0:00.44elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (100major+13minor)pagefaults 0swaps
g@crash:~/tmp$ /usr/bin/time ruby test.rb 35
9227465
22.95user 0.07system 0:23.02elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (243major+168minor)pagefaults 0swaps
g@crash:~/tmp$ /usr/bin/time python test.py 35
9227465
17.24user 0.00system 0:17.30elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (370major+250minor)pagefaults 0swaps

That's about 52x longer for ruby. Not as big a coefficient as your
program, but the difference between the implementations (in terms of
complexity of the expressions evaluated) in my example is probably
much less than in yours due to the simplicity of the programs. Again,
this is highly dependant on what your C version looks like.

HTH.

James Edward Gray II · Aug 27, 2004

Hi,

I am implementing one and the same algorithm in different programming
languages, in order to compare their performance. It's a backtracking
algorithm, which takes over 115mio recursions to terminate, where the
recursion depth isn't deeper then 49.

[snip description and code]

One of the problems for me hunting through your code is that it's kind
of lengthy. Any chance we could simplify it down to something a little
easier to swallow?

Playing around a little on my own, I've come up with:

#!/usr/bin/ruby -w

class KnightsTour
def KnightsTour.square_to_indices( square )
return (square[0] - ?a), (square[1, 1].to_i - 1)
end

def KnightsTour.indices_to_square( rank, file )
return (?a + rank).chr + (file + 1).to_s
end

def initialize( size, square )
@size = size

@rank, @file = KnightsTour.square_to_indices square
end

def find( *path )
path = [ [ @rank, @file ] ] unless path.length > 0

j = jumps path[-1][0], path[-1][1]
j.delete_if do |a|
path.find { |b| a == b }
end

if j.length > 0 and path.length + 1 == @size * @size
path << j[0]
return path
end

result = nil

j.each do |jump|
to = path.dup
to << jump

result = find( *to )

break if result
end

return result
end

private

def jumps( rank, file )
pos = [ [ rank - 1, file - 2 ], [ rank - 1, file + 2 ],
[ rank - 2, file - 1 ], [ rank - 2, file + 1 ],
[ rank + 1, file - 2 ], [ rank + 1, file + 2 ],
[ rank + 2, file - 1 ], [ rank + 2, file + 1 ] ]
pos.delete_if do |jump|
jump[0] < 0 or jump[1] < 0 or
jump[0] >= @size or jump[1] >= @size
end
return pos
end
end

unless ARGV.length == 2 and ARGV[0] =~ /^[3-8]+$/ and ARGV[1] =~
/^[a-h][1-8]$/
puts "Usage: tour.rb BOARD_SIZE STARTING_SQUARE\n" +
"\tBOARD_SIZE is a digit between 1 and 8\n" +
"\tSTARTING_SQUARE is a square in algerbraic notation\n"
exit
end

tour = KnightsTour.new(ARGV[0].to_i, ARGV[1]).find()
if tour
tour.each { |e| puts KnightsTour.indices_to_square(e[0], e[1]) }
else
puts "No tour found."
end

__END__

My version isn't identical to yours and I doubt it's much faster, if at
all. It is though easier to troubleshoot, for me at least.

Just a thought. Sorry I wasn't more help.

James Edward Gray II

James Edward Gray II · Aug 27, 2004

I don't know what your C code looks like, but when you consider what
must go in a typical C statement vs. an equivalent ruby statement, I
don't find it too surprising. How do other interpreted languages
(e.g., python) go?

I can't speak for Python, but I have a lot of experience with coding
chess searches in Perl.

Unfortunately, I've never been able to achieve reasonable performance
there either. I've had some luck using a simple string as my data
structure and avoiding higher level manipulations. I've also tried
unrolling recursion, again with some success.

If memory serves, my personal best is somewhere around 40x slower than
C. Of course, my C skills aren't anything to brag about so that may be
giving Perl an advantage... <laughs>

James

David Ross · Aug 27, 2004

I can't speak for Python, but I have a lot of
experience with coding
chess searches in Perl.

Unfortunately, I've never been able to achieve
reasonable performance
there either. I've had some luck using a simple
string as my data
structure and avoiding higher level manipulations.
I've also tried
unrolling recursion, again with some success.

If memory serves, my personal best is somewhere
around 40x slower than
C. Of course, my C skills aren't anything to brag
about so that may be
giving Perl an advantage... <laughs>

James

Right, of course interpreted languages will be slower.
They have extra overhead. C programs do as well, but
not much. Garbage collection used to be slow many many
years ago, but have made drastic improvement. I will
probably get yelled at for saying this but note that I
am not biased to what I am going to say(I love both C,
C++, ada, asm and ruby. I'm am only strong in these

I know more.)

When computers get really fast, what language you use
whether it have a huge overhead or not will be
irrelevant. Of course Operating systems will always be
written in C and maybe C++. BSD OSes I havent seen any
C++ code, but there is C++ code in Windows NT.

I can just see a bright future where programming will
all be a breeze with easy debugging and not spending
countless hours on a typo

----------------------------------------
-- Name: David Ross
-- Phone: 865.539.3798
-- Email: drossruby [at] yahoo [dot] com
----------------------------------------

__________________________________
Do you Yahoo!?
New and Improved Yahoo! Mail - Send 10MB messages!
http://promotions.yahoo.com/new_mail

Alexey Verkhovsky · Aug 27, 2004

I can just see a bright future where programming will
all be a breeze with easy debugging and not spending
countless hours on a typo

Insofar as easy debugging and not making typos is concerned, Java +
decent IDE (Intellij or Eclipse) delivers it even today

Boris Glawe · Aug 27, 2004

Florian said:
Moin!

Let me point out two small optimizations. Maybe they will accumulate.

Another thing you might want to note is that Array#unshift is quite a
lot slower than Array#push (100000 iterations of #unshift take ~25
seconds and 100000 iterations of #push only ~0.07 seconds.)

The unshift operation is executed 49 times on a 7x7 Board. You find those
unshift statements only within if-blocks. These if-blocks are executed, if the
search was successfull. Thus the unshift operation ist not the thing, that makes
the algorithm slow.

the recursion is not that easy to understand. If you don't believe me, simply
increase another counter as soon as one of the if-blocks are being executed.

Boris Glawe · Aug 27, 2004

My code is quit long, but the part, that is executed is not very long. The only
thing, that runs for 16 minutes is Board#find_trip

The rest of the lengthy code is for initialization purposes.

The constructor of the Board creates a square with dimension*dimension fields
(instances of Field on a two dimensional array). Then the constructor finds the
neighbours for each field and saves this list of neighbours within the field
which is done by the private method Board#search_neighbours

Most of this code isn't used anymore in Board#find_trip

greets Boris

Boris Glawe · Aug 27, 2004

Another thing you might want to note is that Array#unshift is quite a
lot slower than Array#push (100000 iterations of #unshift take ~25
seconds and 100000 iterations of #push only ~0.07 seconds.)

The unshift operation is executed 49 times on a 7x7 Board. You find those
unshift statements only within if-blocks. These if-blocks are executed, if the
search was successfull. Thus the unshift operation ist not the thing, that makes
the algorithm slow.

the recursion is not that easy to understand. If you don't believe me, simply
increase another counter as soon as one of the if-blocks are being executed.

Lothar Scholz · Aug 27, 2004

Hello James,

JEGI> I can't speak for Python, but I have a lot of experience with coding
JEGI> chess searches in Perl.

JEGI> Unfortunately, I've never been able to achieve reasonable performance
JEGI> there either. I've had some luck using a simple string as my data
JEGI> structure and avoiding higher level manipulations. I've also tried
JEGI> unrolling recursion, again with some success.

JEGI> If memory serves, my personal best is somewhere around 40x slower than
JEGI> C. Of course, my C skills aren't anything to brag about so that may be
JEGI> giving Perl an advantage... <laughs>

On pure algorithm based problems you can expect a script language to
be around 20 - 150 times slower. Some time ago i started a (long)
thread in comp.lang.python where i posted an image manipulation
algorithm. Nobody was able to get it down the 120x line. This
algorithm was really bad, it couldn't use any available c function, so
one c instruction mapped to one python instruction.

So i doubt that there will ever be a time when a script language would
ever be useable for Zip compressors, image manipulation or KI
algorithms. Just because not only the CPU performance increases but
the data sets increases also, maybe faster then CPU performance.

One of the reasons why i don't understand the people on this mailing
list who want to do everything in ruby. I guess a lot of them are
simply to lazy to learn C and C++ well.

Lothar Scholz · Aug 27, 2004

Hello Alexey,

AV> Insofar as easy debugging and not making typos is concerned, Java +
AV> decent IDE (Intellij or Eclipse) delivers it even today

Typos are by definition easier to find in static typed programs.
This has nothing to do with the IDE, only with the compiler/parser.

Klaus Momberger · Aug 27, 2004

Boris Glawe said:
Hi,
....

I included the code below. Is this performance difference normal ? We are
talking about the factor 62 !! Maybe I have done something wrong in my
algorithm, but I don't know what. It's very straight and doesn't use any
expensive container operations:

yes, 2 orders of magnitude in performance between C and Ruby (Python, TCL...) is
usually quite normal. It is up to you to decide if this is "fast enough".
Otoh, if you can use a lot of Ruby built-ins, the result may be a lot better.

-klaus

David Ross · Aug 27, 2004

Typos are by definition easier to find in static
typed programs.

Oh really? That would explain the 20 million dollar
bug..

s=2 when it was meant to be s==2

--David

---------------------------------------
-- Name: David Ross
-- Phone: 865.539.3798
-- Email: drossuby [at] yahoo [dot] com
---------------------------------------

__________________________________
Do you Yahoo!?
New and Improved Yahoo! Mail - Send 10MB messages!
http://promotions.yahoo.com/new_mail

Charles Mills · Aug 28, 2004

Oh really? That would explain the 20 million dollar
bug..

s=2 when it was meant to be s==2

Did you get that deep C secret from "Expert C Programming"?

If I remember right he goes on to say that any half way decent version
of lint would have found that bug. Then he argues that lint should
have never been separated from the compiler in the first place. Not
sure but I think the bug was the reverse 's==2;' instead of 's=2;'.
The statement was all alone on one line and would be flagged with a
warning by just about any compiler these days - statement with no
effect.
-Charlie

--David

---------------------------------------
-- Name: David Ross
-- Phone: 865.539.3798
-- Email: drossuby [at] yahoo [dot] com
---------------------------------------

__________________________________
Do you Yahoo!?
New and Improved Yahoo! Mail - Send 10MB messages!
http://promotions.yahoo.com/new_mail

Alexey Verkhovsky · Aug 28, 2004

Hello Alexey, Hi,

Typos are by definition easier to find in static typed programs.

"Statically typed" is certainly the main enabling factor here, but...

I specifically mentioned Java IDEs, as opposed to other statically typed
languages that I dealt with (C, C++, Pascal (Delphi), Fortran). My
reasons were:

1.
Java language design tries very hard to protect a coder from himself,
E.g., if (a = 0) {} is a syntax error.

2.

This has nothing to do with the IDE, only with the compiler/parser.

Decent Java IDE gives you accurate code completion, continuous parsing
and compile on save. I.e., most of the time instead of typing names, you
select them from a drop-down of all visible identifiers in the scope
(with most likely choices at the top of the list). When you type
something, your typos get highlighted as soon as you type them, not
several minutes later when you compile and run unit tests. Correcting
them in this way doesn't break the flow at all - which is very pleasant.

3.
As a frequent code reviewer I've probably seen thousands of Java bugs,
but very small proportion of them were typos. This is drastically
different from my experience with all other languages, including both
Ruby and C/C++.

That's not to mention debugging, profiling, memory leak hunting and
other such things.

Best regards,
Alex

Problem with displaying character that code number is 219 (after SetConsoleTextAttribute)?	3	Jan 9, 2023
Need help in debugging tic tac toe (Beginner)	0	Jun 28, 2023
Need help in debugging tic tac toe (beginner)	2	Jun 28, 2023
Trouble creating multi dimensional array. 0 to 26 in 3 dimensions.	1	Oct 12, 2022
Tic Tac Toe Game	2	Mar 10, 2024
Why is Python telling me variable is local not global?	3	Sep 2, 2023
Problem with IRB and some sample code...	15	Oct 13, 2005
Pointer to a 2 dimensional array?	21	Dec 15, 2008

performance comparison

Boris Glawe

Joel VanderWerf

Olivier D.

Boris Glawe

Boris Glawe

Florian Gross

George Ogata

James Edward Gray II

James Edward Gray II

David Ross

Alexey Verkhovsky

Boris Glawe

Boris Glawe

Boris Glawe

Lothar Scholz

Lothar Scholz

Klaus Momberger

David Ross

Charles Mills

Alexey Verkhovsky

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads