Performance: Perl versus compiled programs

Yash

I have a Perl program that reads from a large text file and does some
processing on every line it reads. If this same program is written in
C or any other compiled language, would I get significant performance
improvement?
I am primarily asking about whether writing a program in Perl can
cause significant performance impact.

Thanks
Yash
 
Josef Möllers

Yash said:
I have a Perl program that reads from a large text file and does some
processing on every line it reads. If this same program is written in
C or any other compiled language, would I get significant performance
improvement?
I am primarily asking about whether writing a program in Perl can
cause significant performance impact.

If it's processing files, the speed will probably depend mainly on the
speed with which you can stuff in the data.

Since Perl programs are compiled into an intermediate language which is
then interpreted, the performance impact is small.
 
Nils Petter Vaskinn

Writing a bad program in Perl can, but changing language won't help.
If it's processing files, the speed will probably depend mainly on the
speed with which you can stuff in the data.

Since Perl programs are compiled into an intermediate language which is
then interpreted, the performance impact is small.

The speed will depend either on (as you said) disk access (data speed), or
(if the processing is heavy) the efficiency of the algorithm used
(calculation speed).

OP: take a look at your algorithm instead of changing language, unless the
kind of processing you do is typically handled better by a specific
language (read: if there are pre-optimized libraries available to do the
processing you do by hand today).

If you have one part of your algorithm that actually can benefit from
being written in C, you can make a module in C and call it from a Perl
program, so you would only need to port the bits to be optimized. (Look
for "Extending Perl" in your copy of Programming Perl.)
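A minimal sketch of that approach, assuming the Inline::C module is
installed (the vowel-counting function is made up purely for illustration):

```perl
use strict;
use warnings;

# Inline::C compiles the C code at script load time and makes
# count_vowels callable like an ordinary Perl sub.
use Inline C => <<'END_C';
#include <string.h>
int count_vowels(char *s) {
    int n = 0;
    for (; *s; s++)
        if (strchr("aeiouAEIOU", *s))
            n++;
    return n;
}
END_C

print count_vowels("hello world"), "\n";   # the hot loop runs as compiled C
```

Only the inner loop moves to C; the rest of the program stays Perl.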


Why don't you post some code so people could give you suggestions for
improvements?
 
ctcgag

I have a Perl program that reads from a large text file and does some
processing on every line it reads.

What kind of processing? I have to travel some distance. Will a plane,
car, bicycle, or walking be better?
If this same program is written in
C or any other compiled language, would I get significant performance
improvement?

From my general experience with programs easy to write in Perl, the rule of
thumb is that if the program would have been easy to write in C in the
first place, it will be much faster if written in C. If it would have been
hard to write in C, it won't be that much faster, anyway.

(Once you do all the hard stuff like array boundary checking, add flags for
missing or undef data and add code to check that flag at every step, write
your own hash and parsing functionality, etc., you basically have a
hand-made, bug-ridden Perl knock-off, and it will be about the same speed
or slower than real Perl.)

YMMV
I am primarily asking about whether writing a program in Perl can
cause significant performance impact.

Yes, it can.

Xho
 
Alan J. Flavell

Yes, it can.

I'm sure your answer is precisely correct: "yes, it *can*".

You don't need this to be pointed out, but perhaps some readers would
be helped by it: I would want to ask more importantly, was it the
right question? That needs to be understood in context. There's
usually little to be gained from software that mostly gets the right
answer fast, if it also sometimes gets the wrong answer, or crashes
horribly. C is a fine weapon for those skilled in the arts of
weaponry, but it's also rather good at shooting the practitioner in
the foot.

Perl is IMHO an excellent language for developing a reliable and
resilient prototype. For many purposes, the end of the prototype
stage is the earliest stage at which considerations of code
performance are appropriate to be considered (rules 1 and 2, and
possibly 3, of optimization are "don't optimize yet").

Many of those Perl prototypes go on to be reliable and resilient
production systems coded in Perl ("don't optimize in the wrong
place"). If and when it becomes evident that the code is too slow (to
the extent that it's uneconomic to solve the problem by throwing some
more CPU at it), that's the time to consider re-implementing. The
work that was put into the prototype won't by any means have been
wasted! And at least you'll have some code to profile, so you can see
where most of the time is being spent (and I have to say that's rarely
obvious ab initio, it becomes clear only when the application has been
prototyped and is actually running).

I don't myself recall a situation where an application was ever turned
from being impractical into feasible merely by recoding in a different
language. I do, however, recall an occasion where a task that needed
3 days computation was re-examined from scratch, and a totally
different algorithm was used, bringing the time down to about 20
minutes. You're not likely to get that scale of savings by mere
choice of coding language: you've got to examine the problem at a
fundamental level.

That's my "take" on the topic, anyhow. We physicists are reputed to
code FORTRAN in any language, by the way ;-}
 
Walter Roberson

:I don't myself recall a situation where an application was ever turned
:from being impractical into feasible merely by recoding in a different
:language.

Programming for very limited computers. Some languages drag in
large libraries that are not very divisible. If you have very
limited resources, then programming in something like Forth can
end up with a practical program where the C library would not fit;
if you have a few more resources, then C might fit where C++ did not.
 
Stuart Moore

Alan said:
On Fri, 9 Jan 2004 (e-mail address removed) wrote:

I don't myself recall a situation where an application was ever turned
from being impractical into feasible merely by recoding in a different
language. I do, however, recall an occasion where a task that needed
3 days computation was re-examined from scratch, and a totally
different algorithm was used, bringing the time down to about 20
minutes. You're not likely to get that scale of savings by mere
choice of coding language: you've got to examine the problem at a
fundamental level.

I've had one: we were trying to write a gui front end to some perl
command line tools. We thought writing the gui in perl would make it
easier as we'd be throwing the same data structures around the whole
time. But we found that it was slow and very very memory hungry,
abandoned it and wrote in C++, using sockets to call (a modified version
of) the command line tools.

Perl is very good at text processing, which is (presumably) what the OP
wants, so it's unlikely that another language would do significantly
better. But where something has been bolted on to Perl, it may not join
very well - this was the case in our situation.

(For reference, this was using wxPerl bindings to the wxWindows C++
toolkit, then using the wxWindows toolkit directly in C++. wxPerl was I
think relatively new and being improved all the time, so may be better
by now, but at the time wasn't good enough.)

Stuart
 
Tore Aursand

I have a Perl program that reads from a large text file and does some
processing on every line it reads. If this same program is written in C
or any other compiled language, would I get significant performance
improvement?

Normally, no. It all depends on your solution; if the algorithm which
takes care of the processing of each line stinks in Perl, it probably will
in any other language too.

Consider this simple code:

while ( <DATA> ) {
    chomp;
    print $_ if ( /something/ );
}

I doubt that you will be able to write a similar program with any other
language that is _much_ faster. The time you'll eventually save will get
lost in a harder implementation.
I am primarily asking about whether writing a program in Perl can cause
significant performance impact.

Every language _can_ cause a significant performance impact. If you don't
know how to utilize a language as well as possible, you _can_ end up with
a solution below par.

If you have an existing script written in Perl, I think that most people
here would be glad to review it for you and factor out any performance
issues. If you _still_ think that your Perl script is too slow, then you
should consider another solution (which will not necessarily be better).
 
Walter Roberson

:Consider this simple code:

: while ( <DATA> ) {
: chomp;
: print $_ if ( /something/ );
: }

Useful code ifyou need allthe lines to runtogether when printed out.
 
Tore Aursand

Useful code ifyou need allthe lines to runtogether when printed out.

Except that it doesn't _necessarily_ print all the lines out together. If
I wanted a script to do that, I would have used something like this;

print (chomp && $_) while ( <DATA> );

Thus my first example could easily have been shortened to this:

print (/something/ && $_) while ( <DATA> );
 
Sam Holden

Except that it doesn't _necessarily_ print all the lines out together.

All the lines that are printed out are run together. Which is clearly
what was meant by "all" the lines.
 
Ben Morrow

:Consider this simple code:

: while ( <DATA> ) {
: chomp;
: print $_ if ( /something/ );
: }

Useful code ifyou need allthe lines to runtogether when printed out.

That depends on the value of $\.

Ben
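(For readers following along: $\ is Perl's output record separator, empty
by default. A small sketch of its effect:)

```perl
use strict;
use warnings;

print "first";
print "second";          # no separator set: output runs together

{
    local $\ = "\n";     # output record separator, appended by every print
    print "third";       # now each print ends with a newline
    print "fourth";
}
```

With $\ set to "\n", the chomp in the earlier loop no longer makes the
printed lines run together.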
 
Alan J. Flavell

Actually there's another substantive comment to be made on this topic,
which perhaps calls for a separate f'up; but first:

Alan said:
I don't myself recall a situation where an application was ever turned
from being impractical into feasible merely by recoding in a different
language.
[...]

I've had one: we were trying to write a gui front end to some perl
command line tools. [...]

OK, point taken. Maybe I should have stressed more clearly that I was
reporting from personal experience, and didn't mean to imply that a
case might not be found if one looked hard enough.

On the other hand, I do recall a formative experience, even though
this was a considerable time back (does 360/91 ring any bells?). We
were developing a program where it was painfully obvious where the
bottleneck was going to be: the central part of the program was
inspecting thousands of co-ordinate triplets, and trying to
reconstruct track candidates in space. So in a fit of premature
optimisation, one of the programmers hand-coded that relatively small
amount of code in Assembler.

It turned out that the FORTRAN optimiser produced significantly
more-efficient code than the Assembler programmer. There were marginal
savings possible by hand-optimising the code that had been produced by
the compiler, although most attempts at optimization made the code
worse (due to the effects of pipelining in the 360/91, which the
optimising compiler knew very well how to exploit).

So the bottom line on *that* project was:

Hand-coded: worst
Compiler-generated: rather good
Hand-optimised: only slightly better, after a lot of effort

Perl is very good at text processing, which is (presumably) what the OP
wants, so it's unlikely that another language would get significantly
better.

I've used Perl successfully for lots of applications that wouldn't
really count as "text processing". But I see your point.

cheers
 
Alan J. Flavell

The other point that I had been meaning to raise was this:

| Subject : Performance: Perl versus compiled programs

Only one of the replies, as far as I can see, has chosen to comment on
this point, and that in a somewhat low-key fashion.

The questioner should be aware that Perl _is_ in a sense compiled
(into an intermediate code). In practice it's usually run as
compile-and-go, rather than in separate compilation/save-object and
run-saved-object steps. So it isn't a straight comparison between a
run-time interpretative language on the one hand, and a pre-compiled
binary on the other hand. Indeed, nowadays so-called compiled objects
are often just a relatively small code framework, which calls up
potentially massive external dynamic libraries at run time; which
obscures the comparison even further.

Premature optimisation is bad anyway (I think this message emerges
from most of the replies on this thread): premature optimisation based
on a misunderstanding of the issues would only make the mistake worse.

all the best
 
ctcgag

Alan J. Flavell said:
I don't myself recall a situation where an application was ever turned
from being impractical into feasible merely by recoding in a different
language.

I have several of those. Most of the recoding was done by way of
Inline::C. I love that module.

Xho
 
Bill

I have a Perl program that reads from a large text file and does some
processing on every line it reads. If this same program is written in
C or any other compiled language, would I get significant performance
improvement?
I am primarily asking about whether writing a program in Perl can
cause significant performance impact.

Thanks
Yash

C will usually beat perl hands down. In this simple example C is over
3 times faster.

#include <stdio.h>

int main(void) {
    int i = 0;
    char line[BUFSIZ+1];
    while (fgets(line, BUFSIZ, stdin) != NULL) {
        i++;
    }
    printf("-> %d\n", i);
    return 0;
}


time ./a.out < /usr/dict/words
-> 230534

real 0m0.167s
user 0m0.160s
sys 0m0.000s


----------------------------------

#!/usr/local/bin/perl
use strict;

my $i = 0;
while (<STDIN>) {
    $i++;
}
print "-> $i\n";


localhost:~$ time ./try.pl < /usr/dict/words
-> 230534

real 0m0.548s
user 0m0.520s
sys 0m0.010s
 
Walter Roberson

:C will usually beat perl hands down. In this simple example C is over
:3 times faster.

Your simple example has a fair bit of startup time on the perl side.
You need to factor that out to figure out the relative execution
speeds over longer programs.
 
Uri Guttman

WR> In article <[email protected]>,
WR> :C will usually beat perl hands down. In this simple example C is over
WR> :3 times faster.

WR> Your simple example has a fair bit of startup time on the perl side.
WR> You need to factor that out to figure out the relative execution
WR> speeds over longer programs.

easily done by printing the time info from inside the perl script.

and remove the $i++ stuff from perl as it is doing that count with
$. already.

benchmarking is an art as much as anything else.

uri
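(A sketch of the $. point: Perl maintains the line counter itself, so the
loop body can stay empty.)

```perl
use strict;
use warnings;

# $. holds the input line number of the last filehandle read,
# so no hand-rolled counter is needed.
while (<STDIN>) { }
print "-> $.\n";
```

Fed three lines on stdin, this prints "-> 3".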
 
Martien Verbruggen

That entirely depends on what you need to do. Text manipulation in
Perl is generally very fast and it is often not trivial to write a set
of tools that will outperform most of Perl's text manipulation.

Yes it can. If you use Perl for things it wasn't designed for. Or if
you need to start up your program many, many times for very small run
times (although there are ways around that).
C will usually beat perl hands down. In this simple example C is over
3 times faster.

That depends entirely on your processing and how much of Perl's
internal trickery you can use.

Apart from that, Perl programs generally have some startup cost
(parsing and compiling) that you need to factor out. If you have a
program that spends a large amount of time (relative to the startup
time) processing text, Perl might actually beat a C program.

There are many things that Perl is really, really fast at, for which
you would have to write large amounts of C code to achieve the same
speed. The equivalent Perl program is likely to have many fewer lines
of code, especially if builtin regular expressions, grep, map and
other Perl niceties can be used.
#include <stdio.h>
int main(void) {
int i=0;
char line[BUFSIZ+1];
while(fgets(line, BUFSIZ, stdin) != NULL) {
i++;
}
printf ("-> %d\n", i);
return 0;
}

This, of course, is trivial stuff. You have fixed your input line
length, and don't correctly deal with possibly longer lines. You're
not manipulating the contents of the line buffer in any way (even
determining the length of the string, or replacing parts of the text
with something else would possibly already be slower than the Perl
equivalents). Perl strings are much smarter than C strings, and,
depending on what you do, can outperform certain operations on them.
length($buffer) is much faster than strlen(buffer) in the general
case, and O(1) instead of O(n). For long strings, this matters a lot.
The s/// operation in Perl is fast, and probably no slower than using
another regular expression library (or even Perl's own), and manually
replacing things; however, the amount of code to write and maintain is
much, much smaller.
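(A sketch of those two claims, with made-up data:)

```perl
use strict;
use warnings;

# Perl stores a string's length with the string, so length() is O(1);
# C's strlen() must scan the whole buffer on every call.
my $big = "x" x 1_000_000;
print length($big), "\n";            # prints 1000000 without scanning

# s///g replaces every match in one short, maintainable line.
my $line = "foo bar foo";
(my $copy = $line) =~ s/foo/baz/g;
print "$copy\n";                     # prints "baz bar baz"
```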

You don't factor out the startup cost for the Perl program. Only
if your program has to run very often for small files would that
startup cost be important. If your program has to run once, for a
large file, the startup cost is likely to be insignificant.

In other words: It all depends, but your example is not at all
representative, or in any way indicative of what the OP could
realistically expect.


I generally use Perl first. If I then decide that there are
performance problems that can't trivially be fixed by using decent
hardware, I see whether I can extract some of the code, and rewrite
the slow bits in XS or C (with Inline::C), or maybe move part of the
processing to a server component that's faster. Writing everything in
C (or another machine-compiled language) is something I generally only
consider for something that I know, beforehand, is time-critical, and
for which I can predict that Perl will be too slow, e.g. numerical
computations (even though there are modules for many computation
intensive things, like PDL).


Without a lot more information, it is absolutely impossible to predict
whether Perl or C will run faster, for the OP. It is likely that the
amount of development time the OP has to spend, will be much smaller
with Perl, however, and time-critical parts of the code could still be
written in C.


Martien
 
