Python versus Perl

  • Thread starter Dieter Vanderelst

Dieter Vanderelst

Dear all,

I'm currently comparing Python versus Perl to use in a project that
involves a lot of text processing. I'm trying to determine what the most
efficient language would be for our purposes. I have to admit that,
although I'm very familiar with Python, I'm a complete Perl noob (and I
hope to stay one), which is reflected in my questions.

I know that the web offers a lot of resources on Python/Perl
differences. But I couldn't find a satisfying answer to my questions:

1 - How does the speed of execution of Perl compare to that of Python?

2 - Regular Expressions are a valuable tool in text processing. I have
noticed that Regular Expressions are executed very fast in Python. Does
anybody know whether Python executes RE faster than Perl does?

3 - In my opinion Python is very well suited for text processing. Does
Perl have any advantages over Python in the field of text processing
(like a larger standard library, maybe)?

I hope somebody can answer my questions. Of course, every remark and tip
on Python/Perl in text processing is most welcome.

With kind regards,
Dieter
 
Michael Sparks

Dieter said:
Dear all,

I'm currently comparing Python versus Perl to use in a project that
involves a lot of text processing. I'm trying to determine what the
most efficient language would be for our purposes. I have to admit
that, although I'm very familiar with Python, I'm a complete Perl noob
(and I hope to stay one), which is reflected in my questions.

I know that the web offers a lot of resources on Python/Perl
differences. But I couldn't find a satisfying answer to my questions:

1 - How does the speed of execution of Perl compare to that of
Python?

Much of a muchness in my experience. (Qualitative, not quantitative.)

2 - Regular Expressions are a valuable tool in text processing. I have
noticed that Regular Expressions are executed very fast in Python.
Does anybody know whether Python executes RE faster than Perl does?

3 - In my opinion Python is very well suited for text processing. Does
Perl have any advantages over Python in the field of text processing
(like a larger standard library, maybe)?

These two are related. If you're writing code and you expect to be
using *a lot* of regular expression [*] type code then you may find perl
more convenient.

[*] That /might/ suggest you're taking the wrong approach mind you...

Python, for me, tends to be more readable, both immediately after
writing and if I go back to it a year later for maintenance, extension,
etc.

Personally I like both languages for day in, day out use, but these days
I tend to choose python if I think I'm likely to want to modify or extend
the code. The exception is where I'm doing heavy text processing work
that I think will be more maintainable in perl, or where I'm really sure
I won't have to maintain it (e.g. quick and dirty scripts).

One side effect of perl usage, though, is that because regular
expressions are so easy and convenient to use there, they can get
overused. (Rather than thinking "what's the best way of solving this
problem", people can end up thinking "what regular expression can solve
this problem", which isn't ideal.)

Your comment """I'm complete Perl noob (and I hope to stay one) """
would suggest to me that if you really feel that way, stay that way :)
(Though personally I do like learning new programming languages, since
you get more idioms and tools under your belt that way.)

I hope somebody can answer my questions. Of course, every remark and
tip on Python/Perl in text processing is most welcome.

In case you're not aware there's the book "Text Processing in Python" by
David Mertz, which is available online in a "free as in beer" form
which might be of use if you decide python instead of perl.


Michael.
--
(e-mail address removed), http://kamaelia.sourceforge.net/
British Broadcasting Corporation, Research and Development
Kingswood Warren, Surrey KT20 6NP

This message (and any attachments) may contain personal views
which are not the views of the BBC unless specifically stated.
 
Terry Hancock

Your comment """I'm complete Perl noob (and I hope to stay one) """
would suggest to me that if you really feel that way, stay that way :)

I missed that on the first reading. IMHO, people love perl *really*
because it was the first language of its type. However, we learned
a lot from that experience, and have since made better languages
in the same general category. The best of these of course, is
Python. ;-)

I felt that way about C, and occasionally Fortran. But I've gotten
over it. ;-)

I took Perl classes after I learned Python, and I haven't found
anything Perl is enough better suited to do that it is worth the
trouble of messing with it. Yes, the one and two liner programs are
nice, but now that six months have passed and I can no longer remember
Perl syntax, it's a lot easier to do it in Python, even if I do wind
up using, say, 4 lines of code.

The biggest distinction I got from looking at Perl from the perspective
of Python is that:

1) Perl makes regular expressions first-class objects, which makes them
really easy to use, and a "beginner" subject in a Perl class.

2) Python makes objects and classes really easy to use, so they are a
"beginner" subject.

However, each can do the other when pressed. So which would you rather
have be easy?

Regular expression programming makes huge incomprehensible piles of
gobbledygook which you forget 10 seconds after you wrote them, while
objects and classes make it easy to understand the structure of
your program.
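
As a rough illustration of "each can do the other" on the Python side, here is a minimal sketch in current Python 3 (not code from this thread). The `LogEntry` class, the `LINE_RE` pattern and the sample line are all made up for the example; the point is only that the regular expression lives in one named place and the rest of the program deals with an object.

```python
import re
from dataclasses import dataclass

# One compiled pattern, kept in one place, with named groups.
LINE_RE = re.compile(r"(?P<level>[A-Z]+):\s*(?P<message>.*)")


@dataclass
class LogEntry:
    level: str
    message: str

    @classmethod
    def parse(cls, line):
        """Return a LogEntry, or None if the line does not match."""
        match = LINE_RE.fullmatch(line)
        if match is None:
            return None
        return cls(match["level"], match["message"])


entry = LogEntry.parse("ERROR: disk full")
print(entry)
```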

Even regular expressions are clearer in Python (IMHO) because of the
ability to apply string operations on them. Furthermore, the ready
availability of more direct methods of string manipulation encourages
more optimized and clearer design decisions (in Python if you just
want to find a word, you can just say so, instead of "crafting a
routine regular expression").
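
A minimal sketch of both points, again in current Python 3 and with a made-up sample sentence and word list: a plain substring test needs no regular expression, and because a pattern is an ordinary string it can be assembled with normal string operations.

```python
import re

text = "Perl and Python are both popular for text processing."

# Finding a word needs no regular expression at all.
has_python = "Python" in text      # substring test
position = text.find("Perl")       # index of the first match, or -1

# A pattern is an ordinary string, so string operations can build it.
words = ["Perl", "Python", "C++"]
pattern = re.compile("|".join(re.escape(w) for w in words))
found = pattern.findall(text)

print(has_python, position, found)
```

`re.escape` is there because "C++" contains characters that are special inside a pattern.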

Performance is a complete non-issue. Both languages are reasonably
fast, and neither has a clear advantage on real world projects. Python
and Perl are "rivals" precisely because they are very similar in what
they can do.

So I'd second the suggestion to eschew the Perl if you can at all
get away with it. If you're already sold on Python, there's no
reason to question your judgement.

Cheers,
Terry
 
Thorsten Kampe

* Dieter Vanderelst (2005-09-06 18:03 +0100)
I'm currently comparing Python versus Perl to use in a project that
involves a lot of text processing. I'm trying to determine what the most
efficient language would be for our purposes. I have to admit that,
although I'm very familiar with Python, I'm a complete Perl noob (and I
hope to stay one), which is reflected in my questions.

I know that the web offers a lot of resources on Python/Perl
differences. But I couldn't find a satisfying answer to my questions:

1 - How does the speed of execution of Perl compare to that of Python?

Of course Python is faster than Perl. It's the same reason why
Mercedes are faster than BMWs (or was it the other way round?).

2 - Regular Expressions are a valuable tool in text processing. I have
noticed that Regular Expressions are executed very fast in Python. Does
anybody know whether Python executes RE faster than Perl does?

Again: this question doesn't make sense. It's up to you to write your
Regular Expressions fast.

3 - In my opinion Python is very well suited for text processing. Does
Perl have any advantages over Python in the field of text processing
(like a larger standard library, maybe)?

I hope somebody can answer my questions. Of course, every remark and tip
on Python/Perl in text processing is most welcome.

http://gnosis.cx/TPiP/

"In case regular expression operations prove to be a genuinely
problematic performance bottleneck in an application, there are
four steps you should take in speeding things up. Try these in
order:

1. Think about whether there is a way to simplify the regular
expressions involved. Most especially, is it possible to
reduce the likelihood of backtracking during pattern
matching? You should always test your beliefs about such
simplification, however; performance characteristics rarely
turn out exactly as you expect.

2. Consider whether regular expressions are -really- needed
for the problem at hand. With surprising frequency, faster
and simpler operations in the [string] module (or,
occasionally, in other modules) do what needs to be done.
Actually, this step can often come earlier than the first
one.

3. Write the search or transformation in a faster and
lower-level engine, especially [mx.TextTools]. Low-level
modules will inevitably involve more work and considerably
more intense thinking about the problem. But
order-of-magnitude speed gains are often possible for the
work.

4. Code the application (or the relevant parts of it) in a
different programming language. If speed is the absolutely
first consideration in an application, Assembly, C, or C++
are going to win. Tools like swig--while outside the scope
of this book--can help you create custom extension modules
to perform bottleneck operations. There is a chance also
that if the problem -really must- be solved with regular
expressions that Perl's engine will be faster (but not
always, by any means)."
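
Step 2 of the quoted advice, together with the "test your beliefs" caveat from step 1, might look like the following sketch. It is written for current Python 3, where the functions of the old [string] module are available as `str` methods. The sample lines are invented, and the two filters are only equivalent for data like this (a line starting with "ERRORS" would match `startswith` but not the pattern), which is why the sketch checks that before timing anything.

```python
import re
import timeit

# Hypothetical sample data: log-like lines, some starting with "ERROR".
lines = ["ERROR disk full", "INFO started", "WARN low memory", "ERROR timeout"] * 1000

error_re = re.compile(r"^ERROR\b")


def with_regex():
    return [ln for ln in lines if error_re.match(ln)]


def with_str_method():
    return [ln for ln in lines if ln.startswith("ERROR")]


# Only compare speed once both versions select the same lines.
assert with_regex() == with_str_method()

for fn in (with_regex, with_str_method):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__}: {seconds:.4f}s for 20 runs")
```

The timings depend on the machine and the Python version, so none are shown here.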
 
Roy Smith

Dieter Vanderelst said:
1 - How does the speed of execution of Perl compare to that of Python?

To a first-order approximation, Perl and Python run at the same speed.
They are both interpreted languages with roughly the same kind of control
flow and data structures. The two have wildly different kinds of syntax,
but the underlying machinery is fairly similar.

Don't make a decision on which to use based on execution speed. If you
feel compelled to ignore this advice, then don't make the decision until
you've coded your application in both languages and done careful
benchmarking with realistic data for your application domain.

I have noticed that Regular Expressions are executed very fast in
Python. Does anybody know whether Python executes RE faster than Perl
does?

Same answer as above -- for a first cut, assume they are the same speed.
If you insist on a better answer, do measurements on real data.
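
On the Python side, "do measurements on real data" could start from a sketch like this one. `count_matches` and `best_time` are hypothetical helper names, and `sample_lines` is a stand-in corpus; a meaningful number needs lines loaded from the project's own data, and the Perl version would have to be timed separately on the same input.

```python
import re
import timeit


def count_matches(pattern, lines):
    """Count the lines that contain at least one match of a compiled pattern."""
    return sum(1 for line in lines if pattern.search(line))


def best_time(func, *args, repeat=5, number=10):
    """Return the fastest per-call time in seconds over several repeats."""
    timings = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(timings) / number


# Stand-in corpus; replace with lines read from real project data.
sample_lines = ["the quick brown fox", "jumps over", "the lazy dog"] * 2000
word_the = re.compile(r"\bthe\b")

print(count_matches(word_the, sample_lines))
print(f"{best_time(count_matches, word_the, sample_lines):.6f} s per pass")
```

Taking the minimum of several repeats reduces noise from whatever else the machine is doing.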

The big advantage that Python has over Perl is speed of development and ease
of maintenance (i.e. a year from now, you can still understand what your
own code does).
 
Terry Reedy

To a first-order approximation, Perl and Python run at the same speed.

'Speed of execution' is a feature of an implementation, not of languages
themselves. Different implementations of Python (for instance, CPython
versus CPython+Psyco) can vary in speed by more than a factor of 10 for
particular blocks of Python code.

(Yes, I know you are comparing the stock standard implementations, but my
point still stands.)

They are both interpreted languages.

To be useful, every language has to be interpreted sometime by something.
In the narrow technical sense that I presume you mean, 'interpretation'
versus 'compilation' is again an implementation feature, not a language
feature. As far as I know, neither Perl nor Python has an implementation
that directly interprets in the way that Basic or tokenized Basic once was.
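
For CPython specifically, the compile step can be made visible from within Python. This is a minimal sketch in current Python 3 with a made-up one-line program; the exact bytecode listing printed by `dis` differs between CPython versions, so it is not reproduced here.

```python
import dis

source = "total = sum(n * n for n in range(10))"

# compile() turns source text into a code object that holds bytecode.
code = compile(source, "<example>", "exec")
print(type(code).__name__)

# Show the instructions the virtual machine will execute.
dis.dis(code)

# Executing the code object is the separate "runtime" half.
namespace = {}
exec(code, namespace)
print(namespace["total"])
```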

I am being picky because various people have claimed that Python suffers in
popularity because it is known as an 'interpreted language'. So maybe
advocates should be more careful than we have been to not reinforce the
misunderstanding.

Terry J. Reedy
 
Michael Sparks

Terry Reedy wrote:
[...]
I am being picky because various people have claimed that Python suffers
in popularity because it is known as an 'interpreted language'. So maybe
advocates should be more careful than we have been to not reinforce the
misunderstanding.

I sometimes wonder if it might help people understand the situation if
people described it as "interpreted in the same way Java is" (however, I think
that risks confusing things, since python doesn't generally come with a JIT
subsystem, yet).

That said, if you do describe it that way, it'd be more accurate to describe
the python binary as a compiler/runtime rather than an interpreter.

After all:
$ python somefile.py

Is very close to being the same as:
$ javac somefile.java
$ java somefile

It strikes me as ironic that python would probably gain more credibility
with some circles if it had two binaries like this, even though it'd be a
step backwards from a usability perspective :)
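
For what it's worth, the two steps can be done separately with CPython today. The sketch below (current Python 3, with a throwaway `somefile.py` written to a temporary directory) compiles the source to a bytecode file with the standard `py_compile` module, deletes the source, and then runs the bytecode file on its own.

```python
import os
import py_compile
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "somefile.py")
    with open(src, "w") as f:
        f.write('print("hello from bytecode")\n')

    # Step 1, the "javac" half: compile the source to a bytecode file.
    pyc = py_compile.compile(
        src, cfile=os.path.join(tmp, "somefile.pyc"), doraise=True
    )

    # Step 2, the "java" half: run the bytecode file without the source.
    os.remove(src)
    result = subprocess.run(
        [sys.executable, pyc], capture_output=True, text=True, check=True
    )
    output = result.stdout.strip()
    print(output)
```

A bytecode file is tied to the CPython version that wrote it, which is one practical reason the single-binary workflow is the usual one.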

Personally I agree that any language that is described as interpreted has an
image issue. However, I'm not sure whose problem that is. Some people claim
it's "Python's problem", but personally I'd view it as a problem for the
people who buy into the "interpreted bad, compiled good" argument. After all,
they're the ones limiting themselves, and missing out on a whole class of
languages (of which python is just one, of course)!


Michael.
 
Terry Reedy

Michael Sparks said:
That said, if you do describe it that way, it'd be more accurate to describe
the python binary as a compiler/runtime rather than an interpreter.

If Java calls its runtime bytecode interpreter a 'runtime' rather than
'interpreter', so can we. Ditto, if applicable, to .NET clr. Still more
accurate, I think, is 'integrated compiler and runtime'.

It strikes me as ironic that python would probably gain more credibility
with some circles if it had two binaries like this, even though it'd be a
step backwards from a usability perspective :)

Yes. The integration is a practical necessity for interactive mode, which
alternates between compiling and executing. For batch mode, integration
consists of the very nice built-in mini-make.
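
A sketch of what that mini-make relies on in current CPython (3.7 or later; these are implementation details, not anything stated in this thread): the cached bytecode file for a module lives at a predictable path, and in the default timestamp mode its 16-byte header records the source file's modification time and size, which is what gets compared to decide whether to recompile. The module name `mymodule` is made up for the example.

```python
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "mymodule.py")
    with open(src, "w") as f:
        f.write("ANSWER = 42\n")

    # Where CPython caches this module's bytecode when it is imported.
    cached = importlib.util.cache_from_source(src)

    # Write the cache file by hand, forcing the timestamp-based format.
    py_compile.compile(
        src,
        doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )

    # Header layout: magic (4 bytes), flags (4), source mtime (4), source size (4).
    with open(cached, "rb") as f:
        header = f.read(16)
    flags = int.from_bytes(header[4:8], "little")
    recorded_size = int.from_bytes(header[12:16], "little")
    source_size = os.path.getsize(src)

    print(os.path.basename(cached), flags, recorded_size, source_size)
```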

Personally I agree that any language that is described as interpreted has an
image issue. However, I'm not sure whose problem that is. Some people claim
it's "Python's problem", but personally I'd view it as a problem for the
people who buy into the "interpreted bad, compiled good" argument. After all,
they're the ones limiting themselves, and missing out on a whole class of
languages (of which python is just one, of course)!

That has been my response. And as a Python programmer, that is the end of
it. But as a responder/advocate, I am beginning to accept that the
misperception is wide enough to also be a bit my problem. Hence my small
effort for a more effective vocabulary. Thanks for your contribution.

Terry J. Reedy
 
Dennis Lee Bieber

I sometimes wonder if it might help people understand the situation if
people described it as "interpreted in the same way Java is" (however, I think
that risks confusing things, since python doesn't generally come with a JIT
subsystem, yet).

I'd probably try pushing the pre-interpretation phase more heavily:
.... compiled to machine code for a virtual machine, just as Java (and
UCSD Pascal).

Heck -- isn't the goal of M$'s .NET to virtualize the entire
computer <G> {Ack -- next we'll find out M$ plans to implement IBM's
 
