A reply for rusi (FSR)


jmfauth

As a reply to rusi's comment:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/a7689b158fdca29e#

From string creation to the itertools usage. A medley. Some timings.

Important:
The real/absolute values of these experiments are not important. I do
not care and I'm not complaining at all.

These values are expected; I expected such values, and they only
confirm (*FOR ME*) my understanding of the coding of the characters
(and Unicode).

#~ py323                                      py330

#~ test   1:         0.015357737412819        0.019290216142579
#~ test   2:         0.015698801667198        0.020386269052436
#~ test   3:         0.015613338684288        0.018769561472500
#~ test   4:         0.023235297708529        0.032253414679390
#~ test   5:         0.023327062109534        0.029621391108935
#~ test   6:         1.119958127076760        1.095467665651482
#~ test   7:         0.420158472788311        0.565518010043673
#~ test   8:         0.649444234615974        1.061556978013171
#~ test   9:         0.712335144072079        1.211614222458175
#~ test  10:         0.704622996001357        1.160909074081441
#~ test  11:         0.614674584923621        1.053985430333688
#~ test  12:         0.660336235792764        1.059443246081010
#~ test  13:         4.821435927771570        5.795325214218677
#~ test  14:         0.494012668213403        0.729330462512273
#~ test  15:         0.504894429585788        0.879966255906103
#~ test  16:         0.693093370081103        1.132884304782264
#~ test  17:         0.749076743789461        3.013804437852462
#~ test  18:         7.467055989281286       13.387841650089342
#~ test  19:         7.581776062566778       13.593412812594643
#~ test  20:         9.477877493343140       15.235388291413805
#~ test  21:         0.022614608026196        0.020984116094176
#~ test  22:         6.685022041178975       12.687538276191944
#~ test  23:         6.946794763994170       12.986701250949636
#~ test  24:         0.097796827314760        0.156285014715777
#~ test  25:         0.024915807146677        0.034190706904894
#~ test  26:         0.024996544066013        0.032191582014335
#~ test  27:         0.000693943667684        0.001315421027272
#~ test  28:         0.000679765476967        0.001305968900141
#~ test  29:         0.001614344548152        0.025543979763000
#~ test  30:         0.000204008410812        0.000286714523313
#~ test  31:         0.000213460537964        0.000301286552656
#~ test  32:         0.000204008410819        0.000291440586878
#~ test  33:         0.249692904327539        0.497374474766957
#~ test  34:         0.248750448483740        0.513947598194790
#~ test  35:         0.099810130396032        0.249129715085319

jmf
 

rusi

As a reply to rusi's comment: http://groups.google.com/group/comp.lang.python/browse_thread/thread/...

[snip jmf's timings for tests 1-35]

jmf

Thank you jmf. I believe that for the first time you have moved beyond
a single point of complaint to a swathe of data points which evidently
show a performance regression. You would need to provide details of what
these tests 1-35 are.
 

rusi

[snip]

Uhhh..
Making the subject line useful for all readers
 

Chris Angelico

Uhhh..
Making the subject line useful for all readers

I should have read this one before replying in the other thread.

jmf, I'd like to see evidence that there has been a performance
regression compared against a wide build of Python 3.2. You still have
never answered this fundamental, that the narrow builds of Python are
*BUGGY* in the same way that JavaScript/ECMAScript is. And believe you
me, the utterly unnecessary hassles I have had to deal with when
permitting user-provided .js code to script my engine have wasted
rather more dev hours than you would believe - there are rather a lot
of stupid edge cases to deal with.

The PEP 393 string is simply a memory-optimized version of UTF-32. It
guarantees O(1) indexing and slicing, while still remaining tight in
many cases. Its worst case is a constant amount larger than pure
UTF-32 (the overhead of recording the string width), its best case is
equivalent to ASCII (if all strings are seven-bit).
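A quick way to see this sizing behaviour for yourself (a sketch for CPython 3.3 or later; the exact per-object overheads vary by version and platform):

```python
import sys

# Three strings of equal length whose widest characters force the three
# PEP 393 storage kinds: 1, 2 and 4 bytes per code point.
n = 10000
latin1 = 'a' * n             # all code points <= U+00FF
bmp    = '\u1234' * n        # widest code point inside the BMP
astral = '\U00012345' * n    # contains astral (non-BMP) code points

for name, s in [('latin-1', latin1), ('bmp', bmp), ('astral', astral)]:
    # The fixed header overhead becomes negligible as n grows, so the
    # ratio approaches the per-character width: ~1, ~2 and ~4 bytes.
    print(name, sys.getsizeof(s) / n)
```

The same three strings would all cost 4 bytes per character under pure UTF-32.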

The flexible string representation is not brand new. It has been
tested and proven in another language, one very similar to Python; and
its performance has been provably sufficient for everyday operations.
Pike's string type behaves just as Python 3.3's, and has done for
longer than I can trace backward. In terms of Unicode compliance, it
is perfect; in terms of performance, quite acceptable; the worst-case
operation is taking an ASCII string and overwriting one character in
it with an astral character - which Python flat-out doesn't permit,
but Pike does, as a known-slow operation. (It triggers a copy of the
string, so it's always going to be slow.)

There are two broad areas of complaint that you have raised. One is of
Unicode compliance and correctness. I believe those complaints are
utterly unfounded, and you have yet to show any serious evidence to
support them. Py 3.3 is perfectly compliant with everything I have yet
checked. The other complaint is of performance, and the issue of being
US-centric. While it's true that ASCII and Latin-1 strings will be
smaller/faster under Py 3.3 than 3.2, this is not purely to the
benefit of the US at the cost of everyone else; it's also a benefit to
the myriad non-US programs that use a lot of ASCII strings - for
instance, delimiters, HTML tags, builtin function names... all of
these are ASCII, even if the rest of the code isn't. And there's no
penalty for non-English speakers, when compared against a non-buggy
wide build. The very worst case is only a constant factor worse, and
that assumes astral characters in every single string... which does
not happen, trust me on that.

ChrisA
 

rusi

I should have read this one before replying in the other thread.

jmf, I'd like to see evidence that there has been a performance
regression compared against a wide build of Python 3.2. You still have
never answered this fundamental, that the narrow builds of Python are
*BUGGY* in the same way that JavaScript/ECMAScript is. And believe you
me, the utterly unnecessary hassles I have had to deal with when
permitting user-provided .js code to script my engine have wasted
rather more dev hours than you would believe - there are rather a lot
of stupid edge cases to deal with.

This assumes that there are only three choices:
- narrow build that is buggy (surrogate pairs for astral characters)
- wide build that is 4-fold space inefficient for wide variety of
common (ASCII) use-cases
- flexible string engine that chooses a small tradeoff of space
efficiency over time efficiency.

There is a fourth choice: narrow build that chooses to be partial over
being buggy, i.e. when an astral character is encountered, an exception
is thrown rather than trying to fudge it into a 16-bit representation.
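That fourth choice is not implemented by any Python build; its intended behaviour can be sketched as a validating function (a hypothetical illustration, and the name `check_bmp` is mine):

```python
def check_bmp(s):
    """Hypothetical 'partial narrow' semantics: accept any BMP-only string,
    raise instead of fudging an astral character into two 16-bit units."""
    for i, ch in enumerate(s):
        if ord(ch) > 0xFFFF:
            raise ValueError("astral character U+%06X at index %d "
                             "not representable in 16 bits" % (ord(ch), i))
    return s

print(check_bmp('caf\u00e9 \u1234'))    # BMP-only: accepted as-is
try:
    check_bmp('ok \U00012345')
except ValueError as exc:
    print(exc)                          # raised instead of storing surrogates
```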

I am hardly a Unicode expert, but my impression is this: while in today's
internationalized world going back to ASCII is not an option, most
actual uses of Unicode stay within the BMP.

Further if the choice is not between two python executables but
between string-engines chosen at startup by command-line switches or
equivalent, the price may be quite small.
 

Thomas 'PointedEars' Lahn

Chris said:
I should have read this one before replying in the other thread.

jmf, I'd like to see evidence that there has been a performance
regression compared against a wide build of Python 3.2. You still have
never answered this fundamental, that the narrow builds of Python are
*BUGGY* in the same way that JavaScript/ECMAScript is.

Interesting. From my work I was under the impression that I knew ECMAScript
and its implementations fairly well, yet I have never heard of this before.

What do you mean by “narrow build” and “wide build” and what exactly is the
bug “narrow builds” of Python 3.2 have in common with JavaScript/ECMAScript?
To which implementation of ECMAScript are you referring – or are you
referring to the Specification as such?
 

Chris Angelico

This assumes that there are only three choices:
- narrow build that is buggy (surrogate pairs for astral characters)
- wide build that is 4-fold space inefficient for wide variety of
common (ASCII) use-cases
- flexible string engine that chooses a small tradeoff of space
efficiency over time efficiency.

There is a fourth choice: narrow build that chooses to be partial over
being buggy. ie when an astral character is encountered, an exception
is thrown rather than trying to fudge it into a 16-bit
representation.

As a simple factual matter, narrow builds of Python 3.2 don't do that.
So it doesn't factor into my original statement. But if you're talking
about a proposal for 3.4, then sure, that's a theoretical possibility.
It wouldn't be "buggy" in the sense of "string indexing/slicing
unexpectedly does the wrong thing", but it would still be incomplete
Unicode support, and I don't think people would appreciate it. Much
better to have graceful degradation: if there are non-BMP characters
in the string, then instead of throwing an exception, it just makes
the string wider.
I am hardly a Unicode expert, but my impression is this: while in today's
internationalized world going back to ASCII is not an option, most
actual uses of Unicode stay within the BMP.

That's a valid line of argument for an optimization, but not for a
hard limitation. A general-purpose language, function, system,
whatever, will need to cope with astral characters at some point; it
just won't need them *often*.
Further if the choice is not between two python executables but
between string-engines chosen at startup by command-line switches or
equivalent, the price may be quite small.

It's complexity cost, though, and people would need to know when it
would be worth giving Python that switch to change its string format.
Plus, every C extension would need to cope with both formats. I
personally doubt it'd be worth it, but if you want to knock together a
patched CPython and get some timing stats, I'm sure this list or
python-dev will be happy to discuss the matter. :)

ChrisA
 

Chris Angelico

Interesting. From my work I was under the impression that I knew ECMAScript
and its implementations fairly well, yet I have never heard of this before.

What do you mean by “narrow build” and “wide build” and what exactly is the
bug “narrow builds” of Python 3.2 have in common with JavaScript/ECMAScript?
To which implementation of ECMAScript are you referring – or are you
referring to the Specification as such?

The ECMAScript spec says that strings are stored and represented in
UTF-16. Python versions up to 3.2 came in two varieties: narrow, which
included (I believe) the Windows builds available on python.org, and
wide, which was (again, I think) the default Linux config. The problem
predates Python 3 and its default string being Unicode - the Py2
unicode type has the same issue:

Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit
(Intel)] on win32

Python 2.6.6 (r266:84292, Sep 15 2010, 15:52:39)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.


That's the Python msi installer, and the default system Python from an
Ubuntu 10.10. The exact same code does different things on different
platforms, and on the Windows (narrow-build), it's possible to split
surrogates:
>>> u"\U00012345"[0]
u'\ud808'
>>> u"\U00012345"[1]
u'\udf45'
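On Python 3.3+, where len('\U00012345') is always 1, the pair a narrow build exposed can still be recovered from the string's UTF-16 encoding (a small sketch of the surrogate arithmetic):

```python
s = '\U00012345'
assert len(s) == 1                       # one code point, one index (PEP 393)

# UTF-16 stores each astral code point as a high/low surrogate pair.
data = s.encode('utf-16-be')             # 4 bytes = two 16-bit code units
units = [int.from_bytes(data[i:i + 2], 'big') for i in (0, 2)]
print([hex(u) for u in units])           # ['0xd808', '0xdf45'] - the two
                                         # "characters" a narrow build shows

# The same values from the surrogate-pair formula:
v = ord(s) - 0x10000
assert units == [0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)]
```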

You can see the same thing in Javascript too. Here's a little demo I
just knocked together:

<script>
function foo()
{
    var txt=document.getElementById("in").value;
    var msg="";
    for (var i=0;i<txt.length;++i)
        msg+="["+i+"]: "+txt.charCodeAt(i)+" "+txt.charCodeAt(i).toString(16)+"\n";
    document.getElementById("out").value=msg;
}
</script>
<input id=in><input type=button onclick="foo()" value="Show"><br>
<textarea id=out rows=25 cols=80></textarea>


Give it an ASCII string and you'll see, as expected, one index (based
on string indexing or charCodeAt, same thing) for each character. Same
if it's all BMP. But put an astral character in and you'll see
00.00.d8.00/24 (oh wait, CIDR notation doesn't work in Unicode) come
up. I raised this issue on the Google V8 list and on the ECMAScript
list (e-mail address removed), and was basically told that since
JavaScript has been buggy for so long, there's no chance of ever
making it bug-free:

https://mail.mozilla.org/pipermail/es-discuss/2012-December/027384.html

Fortunately for Python, there are version numbers, and policies that
permit bugs to actually get fixed. (Which is why, for instance, Debian
Squeeze still ships Python 2.6 rather than upgrading to 2.7 - in case
some script is broken by that change. Can't do that with web
browsers.) As of Python 3.3, all Pythons function the same way: it's
semantically a "wide build" (UTF-32), but with a memory usage
optimization. That's how it needs to be.

ChrisA
 

MRAB

As a simple factual matter, narrow builds of Python 3.2 don't do that.
So it doesn't factor into my original statement. But if you're talking
about a proposal for 3.4, then sure, that's a theoretical possibility.
It wouldn't be "buggy" in the sense of "string indexing/slicing
unexpectedly does the wrong thing", but it would still be incomplete
Unicode support, and I don't think people would appreciate it. Much
better to have graceful degradation: if there are non-BMP characters
in the string, then instead of throwing an exception, it just makes
the string wider.
[snip]
Do you mean that instead of switching between 1/2/4 bytes per codepoint
it would switch between 2/4 bytes per codepoint?
 

Chris Angelico

As a simple factual matter, narrow builds of Python 3.2 don't do that.
So it doesn't factor into my original statement. But if you're talking
about a proposal for 3.4, then sure, that's a theoretical possibility.
It wouldn't be "buggy" in the sense of "string indexing/slicing
unexpectedly does the wrong thing", but it would still be incomplete
Unicode support, and I don't think people would appreciate it. Much
better to have graceful degradation: if there are non-BMP characters
in the string, then instead of throwing an exception, it just makes
the string wider.
[snip]
Do you mean that instead of switching between 1/2/4 bytes per codepoint
it would switch between 2/4 bytes per codepoint?

That's my point. We already have the better version. :)

ChrisA
 

MRAB

[snip]
Do you mean that instead of switching between 1/2/4 bytes per codepoint
it would switch between 2/4 bytes per codepoint?

That's my point. We already have the better version. :)
If a later version of Python switched between 2/4 bytes per codepoint,
how much difference would it make in terms of memory and speed compared
to Python 3.2 (fixed width) and Python 3.3 (3 widths)?

The vast majority of the time, 2 bytes per codepoint is sufficient, but
would that result in less switching between widths and therefore higher
performance, or would the use of more memory (2 bytes when 1 byte would
do) offset that?

(And I'm talking about significant differences here.)
 

Terry Reedy

Wrong. Python almost certainly runs faster with the new string
representation. This has been explained previously more than once.

This is what tcl/tk does, and it is a damned nuisance. Completely
unacceptable for Python's string type.
...
It's complexity cost, though, and people would need to know when it
would be worth giving Python that switch to change its string format.
Plus, every C extension would need to cope with both formats. I
personally doubt it'd be worth it, but if you want to knock together a
patched CPython and get some timing stats, I'm sure this list or
python-dev will be happy to discuss the matter. :)

I presume the smiley indicates that you know that python developers are
too busy with real problems to have any interest in bogus solutions to
bogus problems.
 

Steven D'Aprano

On 13/03/2013 23:43, Chris Angelico wrote:



[snip]
Do you mean that instead of switching between 1/2/4 bytes per
codepoint it would switch between 2/4 bytes per codepoint?

That's my point. We already have the better version. :)
If a later version of Python switched between 2/4 bytes per codepoint,
how much difference would it make in terms of memory and speed compared
to Python 3.2 (fixed width) and Python 3.3 (3 widths)?

The vast majority of the time, 2 bytes per codepoint is sufficient, but
would that result in less switching between widths and therefore higher
performance, or would the use of more memory (2 bytes when 1 byte would
do) offset that?

(And I'm talking about significant differences here.)


That depends on how you use the strings. Because strings are immutable,
there isn't really anything like "switching between widths" -- the width
is set when the string is created, and then remains fixed.

It is true that when you create a string, Python sometimes has to do some
work to determine what width it needs, but that's effectively a fixed-
cost per string. It's relatively trivial compared to the cost of other
string operations, but it is a real cost. If all you do is create the
strings then throw them away, as JMF tends to do in his benchmarks, you
repeatedly pay the cost without seeing the benefit.

On the other hand, Python is *full* of large numbers of ASCII strings,
and many users use lots of Latin1 strings. Both of these save significant
amounts of memory: almost 50% of what they would otherwise use in a
narrow build, and almost 75% in a wide build.
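Those percentages follow from the per-character widths alone (header overhead aside): PEP 393 stores ASCII/Latin-1 text at 1 byte per code point, versus 2 in a narrow build and 4 in a wide build.

```python
# Bytes per code point for Latin-1 text, ignoring fixed per-object overhead.
narrow_build, wide_build, pep393 = 2, 4, 1

print("saving vs narrow: %.0f%%" % (100 * (1 - pep393 / narrow_build)))  # 50%
print("saving vs wide:   %.0f%%" % (100 * (1 - pep393 / wide_build)))    # 75%
```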

This memory saving has real consequences, performance-wise. Python's
memory management can be more efficient, since objects in the heap are
smaller. I'm not sure if objects ever move in the heap (I think Java's
memory manager does move objects around, hence Jython will do so, but I'm
not sure about CPython), but even if they don't, it's obviously faster to
allocate a certain sized block of memory the more free memory you have,
and you'll have more free memory if any pre-existing objects in the heap
are smaller.

I expect that traversing a block of memory byte-by-byte may be faster
than traversing it 2x or 4x bytes at a time. My testing suggests that
iterating over a 1-byte width string is about three times faster than
iterating over a 2-byte or 4-byte wide string. But that may depend on
your OS and hardware.
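Steven's iteration figures are easy to re-check locally (a sketch; the absolute numbers, and even the ratios, depend on your interpreter build, OS and hardware):

```python
import timeit

# Iterate strings of identical length but different PEP 393 widths.
cases = [('1-byte', "s = 'a' * 10000"),
         ('2-byte', "s = '\\u1234' * 10000"),
         ('4-byte', "s = '\\U00012345' * 10000")]

for width, setup in cases:
    t = timeit.timeit("for ch in s: pass", setup=setup, number=100)
    print("%s: %.4fs" % (width, t))   # compare the ratios, not the values
```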

Finally, there may be CPU effects, to do with how quickly strings can be
passed through the CPU pipelines, whether data is found in the CPU cache
or not, etc. Obviously this too will depend on the size of the strings.
You can squeeze 1K of data through the CPU faster than 4K of data.

In practice, how much of an effect will this have? It's hard to say
without testing, but indications with real-world applications indicate
that Python 3.3 not only saves significant memory over 3.2 narrow builds,
but for real-world code, it can often be a little faster as well.
 

Chris Angelico

I presume the smiley indicates that you know that python developers are too
busy with real problems to have any interest in bogus solutions to bogus
problems.

It indicates more that the list(s) would almost certainly open up with
quite a bit of discussion - especially this one. It's not hard to get
talk happening, as evidenced by the number of times we've already
discussed this very topic. Frankly, I doubt there'll be anything to
discuss - that the patched version will be consistently worse; but if
I've learned one thing about timings, it's that there are surprises
*everywhere*, so I'm not prepared to state categorically that it
*cannot* be better. (I will, however, state that I do not expect any
such improvement to be worth the trouble of writing it.)

ChrisA
 

Chris Angelico

That depends on how you use the strings. Because strings are immutable,
there isn't really anything like "switching between widths" -- the width
is set when the string is created, and then remains fixed.

The nearest thing to "switching" is where you repeatedly replace() or
append/slice to add/remove the one non-ASCII character that your
contrived test is using. Let's see...

Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600
32 bit (Intel)] on win32

ASCII -> ASCII:
timeit.timeit("s=s[:-1]+'\u0034'","s='asdf'*10000",number=10000)
0.14999895238081962

ASCII -> BMP:
timeit.timeit("s=s[:-1]+'\u1234'","s='asdf'*10000",number=10000)
1.7513426985832012

BMP -> BMP:
timeit.timeit("s=s[:-1]+'\u1234'","s='\u1234sdf'*10000",number=10000)
0.22562895563542895

ASCII -> SMP:
timeit.timeit("s=s[:-1]+'\U00012345'","s='asdf'*10000",number=10000)
1.9037101084076369

BMP -> SMP:
timeit.timeit("s=s[:-1]+'\U00012345'","s='\u1234sdf'*10000",number=10000)
1.9659967956821163

SMP -> SMP:
timeit.timeit("s=s[:-1]+'\U00012345'","s='\U00012345sdf'*10000",number=10000)
0.7214749360603037

So there *is* cost to "changing size". Trying them again in Python 2.6 Narrow:

Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit
(Intel)] on win32

ASCII -> ASCII:
timeit.timeit("s=s[:-1]+u'\u0034'","s=u'asdf'*10000",number=10000)
0.53506213778566547

ASCII -> BMP:
timeit.timeit("s=s[:-1]+u'\u1234'","s=u'asdf'*10000",number=10000)
0.57752172412974268

BMP -> BMP:
timeit.timeit("s=s[:-1]+u'\u1234'","s=u'\u1234sdf'*10000",number=10000)
0.53309121690045913

ASCII -> SMP:
timeit.timeit("s=s[:-1]+u'\U00012345'","s=u'asdf'*10000",number=10000)
0.55128347317885584

BMP -> SMP:
timeit.timeit("s=s[:-1]+u'\U00012345'","s=u'\u1234sdf'*10000",number=10000)
0.55610140394938412

SMP -> SMP:
timeit.timeit("s=s[:-1]+u'\U00012345'","s=u'\U00012345sdf'*10000",number=10000)
0.6599570615818493

Much more consistent. (Note that the SMP timings are quite probably a
bit off as the string will continue to grow - I'm taking off one
16-bit character and putting on two.)

I don't have a 2.6 wide build on the same hardware, so these times
don't truly compare to the above ones. This is slower hardware than
the above tests.

Python 2.6.6 (r266:84292, Sep 15 2010, 15:52:39)
[GCC 4.4.5] on linux2
timeit.timeit("s=s[:-1]+u'\u0034'","s=u'asdf'*10000",number=10000)
1.5774970054626465
timeit.timeit("s=s[:-1]+u'\u1234'","s=u'asdf'*10000",number=10000)
1.5743560791015625
timeit.timeit("s=s[:-1]+u'\u1234'","s=u'\u1234sdf'*10000",number=10000)
1.6072981357574463
timeit.timeit("s=s[:-1]+u'\U00012345'","s=u'asdf'*10000",number=10000)
1.6745591163635254
timeit.timeit("s=s[:-1]+u'\U00012345'","s=u'\u1234sdf'*10000",number=10000)
1.6705770492553711
timeit.timeit("s=s[:-1]+u'\U00012345'","s=u'\U00012345sdf'*10000",number=10000)
1.7078530788421631

Here's my reading of all these stats. Python 3.3's str is faster than
2.6's unicode when the copy can be done directly (ie when the size
isn't changing), but converting sizes costs a lot (suggestion: memcpy
is blazingly fast, no surprise there). Since MOST string operations
won't change the size, this is a benefit to most programs.

I expect that Python 3.2 will behave comparably to the 2.6 stats, but
I don't have 3.2s handy - can someone confirm please?

ChrisA
 

rusi

I expect that Python 3.2 will behave comparably to the 2.6 stats, but
I don't have 3.2s handy - can someone confirm please?

I have 3.2 but not 3.3. Can run it later today if no one does.
But better if someone with both on the same machine do the comparison.


jmf will you please run Chris' examples on all your pythons?
 

Terry Reedy

I have 3.2 but not 3.3. Can run it later today if no one does.
But better if someone with both on the same machine do the comparison.

The python devs use the microbenchmarks in
Tools/stringbench/stringbench.py, which covers all string operations, as
the basis for improving particular string functions. Overall, Unicode is
nearly as fast as bytes and 3.3 as fast as 3.2. Find/replace is the
notable exception in stringbench, so it is an anomaly. Other things are
faster in 3.3. In selecting the new implementation, the devs also
considered space and speed gains that do not show up in microbenchmarks.
 

Terry Reedy

[snip]

Links to the readme and code for stringbench can be found here:
http://hg.python.org/cpython/file/c25bc2587c48/Tools/stringbench
 

rusi

3.2 and 2.7 results on my desktop using Chris' examples
(Hope I cut-pasted them correctly)
-----------------------------
Welcome to the Emacs shell

~ $ python3
Python 3.2.3 (default, Feb 20 2013, 17:02:41)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
from timeit import timeit
timeit("s=s[:-1]+'\u0034'","s='asdf'*10000",number=10000)
0.2893378734588623
timeit("s=s[:-1]+'\u1234'","s='asdf'*10000",number=10000)
0.2842249870300293
timeit("s=s[:-1]+'\u1234'","s='\u1234sdf'*10000",number=10000)
0.28406381607055664
timeit("s=s[:-1]+'\U00012345'","s='asdf'*10000",number=10000)
0.28420209884643555
timeit("s=s[:-1]+'\U00012345'","s='\u1234sdf'*10000",number=10000)
0.2853250503540039
timeit("s=s[:-1]+'\U00012345'","s='\U00012345sdf'*10000",number=10000)
0.283905029296875

~ $ python
Python 2.7.3 (default, Jan 2 2013, 16:53:07)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
from timeit import timeit
timeit("s=s[:-1]+'\u0034'","s='asdf'*10000",number=10000)
0.20418286323547363
timeit("s=s[:-1]+'\u1234'","s='asdf'*10000",number=10000)
0.20579099655151367
timeit("s=s[:-1]+'\u1234'","s='\u1234sdf'*10000",number=10000)
0.5055279731750488
timeit("s=s[:-1]+'\U00012345'","s='asdf'*10000",number=10000)
0.28449511528015137
timeit("s=s[:-1]+'\U00012345'","s='\u1234sdf'*10000",number=10000)
0.6001529693603516
timeit("s=s[:-1]+'\U00012345'","s='\U00012345sdf'*10000",number=10000)
0.8430721759796143
 

Andriy Kornatskyy

$ python3.2
Python 3.2.3 (default, Jun 25 2012, 22:55:05) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
from timeit import repeat
repeat("s=s[:-1]+'\u0034'","s='asdf'*10000",number=10000)
[0.2566258907318115, 0.14485502243041992, 0.14464998245239258]
repeat("s=s[:-1]+'\u1234'","s='asdf'*10000",number=10000)
[0.25584888458251953, 0.1340939998626709, 0.1338820457458496]
repeat("s=s[:-1]+'\u1234'","s='\u1234sdf'*10000",number=10000)
[0.2571289539337158, 0.13403892517089844, 0.13388800621032715]
repeat("s=s[:-1]+'\U00012345'","s='asdf'*10000",number=10000)
[0.5022759437561035, 0.3970041275024414, 0.3764481544494629]
repeat("s=s[:-1]+'\U00012345'","s='\u1234sdf'*10000",number=10000)
[0.5213770866394043, 0.38585615158081055, 0.40251588821411133]
repeat("s=s[:-1]+'\U00012345'","s='\U00012345sdf'*10000",number=10000)
[0.768744945526123, 0.5852570533752441, 0.6029140949249268]

$ python3.3
Python 3.3.0 (default, Sep 29 2012, 15:35:49) 
[GCC 4.7.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
from timeit import repeat
repeat("s=s[:-1]+'\u0034'","s='asdf'*10000",number=10000)
[0.0985728640225716, 0.0984080360212829, 0.07457763599813916]
repeat("s=s[:-1]+'\u1234'","s='asdf'*10000",number=10000)
[0.901988381985575, 0.7517840950167738, 0.7540924890199676]
repeat("s=s[:-1]+'\u1234'","s='\u1234sdf'*10000",number=10000)
[0.3069786810083315, 0.17701858800137416, 0.1769046070112381]
repeat("s=s[:-1]+'\U00012345'","s='asdf'*10000",number=10000)
[1.081760977016529, 0.9099628589756321, 0.9926943230093457]
repeat("s=s[:-1]+'\U00012345'","s='\u1234sdf'*10000",number=10000)
[1.2101859120011795, 1.1039280130062252, 0.9306247030035593]
repeat("s=s[:-1]+'\U00012345'","s='\U00012345sdf'*10000",number=10000)
[0.4759294819959905, 0.35435649199644104, 0.3540659479913302]


[snip quoted message "Re: String performance regression from python 3.2 to 3.3" repeating rusi's 3.2/2.7 timings]
 
