Stack Overflow moderator “animuson”


Joshua Landau

I don't know what you mean by that, but since the joke appears to have
flown over your head, I'll explain it. Steven's "pos" was clearly
mea

What? I don't understand.
 

Joshua Landau

<Unjustified Insult>. [animuson from Stack Overflow] has deleted all
my postings regarding Python regular expression matching being
extremely slow compared to Perl. Additionally my account has been
suspended for 7 days. <Unjustified Insult>.

Whilst I don't normally respond to trolls, I'm actually curious.

Do you have any non-trivial, properly benchmarked real-world examples
that this affects, remembering to use full Unicode support in Perl (as
Python has it by default)?

Remember to try on both major CPython versions, and PyPy -- all of
which are in large-scale usage. Remember not just to use the builtin
re module, as most people also use https://pypi.python.org/pypi/regex
and https://code.google.com/p/re2/ when they are appropriate, so
pathological cases for re aren't actually a concern anyone cares
about.

If you actually can satisfy these basic standards for a comparison (as
I'm sure any competent person with so much bravado could) I'd be willing
to converse with you. I'd like to see these results where Python compares
as "extremely slow". Note that, by your own wording, a 30% drop is irrelevant.
 

Antoon Pardon

On 10-07-13 11:03, Mats Peterson wrote:
Not a troll. It's just hard to convince Python users that their beloved
language would have inferior regular expression performance to Perl.

All right, you have convinced me. Now what? Why should I care?
 

Joshua Landau

Google Groups is writing about your recently sent mail to "Joshua
Landau". Unfortunately this address has been discontinued from usage
for the foreseeable future. The sent message is displayed below:
 

Steve Simmons

Steven D'Aprano said:
That's by design. We don't want to make the same mistake as Perl, where
every problem is solved by a regular expression:

http://neilk.net/blog/2000/06/01/abigails-regex-to-test-for-prime-numbers/

so we deliberately make regexes as slow as possible so that programmers
will look for a better way to solve their problem. If you check the
source code for the re engine, you'll find that for certain regexes, it
busy-waits for anything up to 30 seconds at a time, deliberately wasting
cycles.

The same with Unicode. We hate French people, you see, and so in an
effort to drive everyone back to ASCII-only text, Python 3.3 introduces
some memory optimizations that ensures that Unicode strings work
correctly and are up to four times smaller than they used to be. You
should get together with jmfauth, who has discovered our dastardly plot
and keeps posting benchmarks showing how on carefully contrived
micro-benchmarks using a beta version of Python 3.3, non-ASCII string
operations can be marginally slower than in 3.2.

dickwad.

I cannot imagine why he would have done that.

:) Thank you.

Sent from a Galaxy far far away
 

Skip Montanaro

... meant to be the word "posted", before his sentence got cut off by the
Python Secret Underground.

Argh! That which shall not be named! Please, for the sake of all that
is right, please only use the initials, PS
 

Mats Peterson

Antoon Pardon said:
On 10-07-13 11:03, Mats Peterson wrote:

All right, you have convinced me. Now what? Why should I care?

Right. Why should you. And who cares about you?

Mats
 

Mats Peterson

Joshua Landau said:
<Unjustified Insult>. [animuson from Stack Overflow] has deleted all
my postings regarding Python regular expression matching being
extremely slow compared to Perl. Additionally my account has been
suspended for 7 days. <Unjustified Insult>.

Whilst I don't normally respond to trolls, I'm actually curious.

Do you have any non-trivial, properly benchmarked real-world examples
that this affects, remembering to use full Unicode support in Perl (as
Python has it by default)?

Remember to try on both major CPython versions, and PyPy -- all of
which are in large-scale usage. Remember not just to use the builtin
re module, as most people also use https://pypi.python.org/pypi/regex
and https://code.google.com/p/re2/ when they are appropriate, so
pathological cases for re aren't actually a concern anyone cares
about.

If you actually can satisfy these basic standards for a comparison (as
I'm sure any competent person with so much bravado could) I'd be willing
to converse with you. I'd like to see these results where Python compares
as "extremely slow". Note that, by your own wording, a 30% drop is irrelevant.

I haven't provided a "real-world" example, since I expect you Python
Einsteins to be able to do an A/B test between Python and Perl yourselves
(provided you know Perl, of course, which I'm afraid is not always the
case). And why would I use any "custom" version of Python, when I don't
have to do that with Perl?

Mats
 

Mats Peterson

Steven D'Aprano said:
That's by design. We don't want to make the same mistake as Perl, where
every problem is solved by a regular expression:

http://neilk.net/blog/2000/06/01/abigails-regex-to-test-for-prime-numbers/

so we deliberately make regexes as slow as possible so that programmers
will look for a better way to solve their problem. If you check the
source code for the re engine, you'll find that for certain regexes, it
busy-waits for anything up to 30 seconds at a time, deliberately wasting
cycles.

The same with Unicode. We hate French people, you see, and so in an
effort to drive everyone back to ASCII-only text, Python 3.3 introduces
some memory optimizations that ensures that Unicode strings work
correctly and are up to four times smaller than they used to be. You
should get together with jmfauth, who has discovered our dastardly plot
and keeps posting benchmarks showing how on carefully contrived
micro-benchmarks using a beta version of Python 3.3, non-ASCII string
operations can be marginally slower than in 3.2.



I cannot imagine why he would have done that.

You're obviously trying hard to be funny. It fails miserably.

Mats
 

Mats Peterson

Chris Angelico said:
You do? And you haven't noticed the inferior performance of regular
expressions in Python compared to Perl? Then you obviously haven't
used them a lot.

That would be correct. Why have I not used them all that much? Because
Python has way better ways of doing many things. Regexps are
notoriously hard to debug, largely because a nonmatching regex can't
give much information about _where_ it failed to match, and when I
parse strings, it's more often with (s)scanf notation instead - stuff
like this (Pike example as Python doesn't, afaik, have scanf support):
data="Hello, world! I am number 42.";
sscanf(data,"Hello, %s! I am number %d.",foo,x);
(3) Result: 2
foo;
(4) Result: "world"
x;
(5) Result: 42

Or a more complicated example:

sscanf(Stdio.File("/proc/meminfo")->read(),"%{%s: %d%*s\n%}",array data);
mapping meminfo=(mapping)data;

That builds up a mapping (Pike terminology for what Python calls a
dict) with the important information out of /proc/meminfo, something
like this:

([
"MemTotal": 2026144,
"MemFree": 627652,
"Buffers": 183572,
"Cached": 380724,
..... etc etc
])

So, no. I haven't figured out that Perl's regular expressions
outperform Python's or Pike's or SciTE's, because I simply don't need
them all that much. With sscanf, I can at least get a partial match,
which tells me where to look for the problem.

ChrisA
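For what it's worth, the /proc/meminfo parse in the quoted Pike example has a short plain-Python counterpart that needs neither regexes nor sscanf. This is only a sketch; the sample data below is invented, and real code would read /proc/meminfo directly:

```python
# Parse "Key:  value [unit]" lines into a dict, mirroring the
# Pike sscanf example above. Sample input is made up for illustration.
sample = """MemTotal:        2026144 kB
MemFree:          627652 kB
Buffers:          183572 kB
Cached:           380724 kB
"""

meminfo = {}
for line in sample.splitlines():
    key, _, rest = line.partition(":")
    # The first whitespace-separated token after the colon is the number.
    meminfo[key] = int(rest.split()[0])

print(meminfo["MemTotal"])  # 2026144
```

Like the sscanf version, a malformed line fails loudly at the exact line being parsed (partition or int raises there), rather than silently producing a non-match.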

You're showing by these examples what regular expressions mean to you.

Mats
 

memilanuk

Or us Brits.

Or the Yanks...

Normally I kill-file threads like this pretty early on, but I have to
admit - I'm enjoying watching y'all play with the troll this time ;)
 

Steven D'Aprano

Ahhh.... so this is pos, right? Telling the truth? Interesting.


Mats, I fear you have misunderstood. If the Python Secret Underground
existed, which it most certainly does not, it would absolutely not have
the power to censor people's emails or cut them off in the middle of
 

Joshua Landau

That's by design. We don't want to make the same mistake as Perl, where
every problem is solved by a regular expression:

http://neilk.net/blog/2000/06/01/abigails-regex-to-test-for-prime-numbers/

so we deliberately make regexes as slow as possible so that programmers
will look for a better way to solve their problem. If you check the
source code for the re engine, you'll find that for certain regexes, it
busy-waits for anything up to 30 seconds at a time, deliberately wasting
cycles.

I hate to sound like this but do you realise that this is exactly what
you're arguing for when saying that sum() shouldn't use "+="?

(There is no spite in the above sentence, but it sounds like there is.
There is however no way obvious to me to remove it without changing
the sentence's meaning.)
The same with Unicode. We hate French people,

And for good damn reason too. They're ruining our language, à mon avis.
 
