String concatenation performance

Cristian.Codorean

I was just reading a "Python Speed/Performance Tips" article on the
Python wiki
http://wiki.python.org/moin/PythonSpeed/PerformanceTips
and I got to the part that talks about string concatenation and that it
is faster when using join instead of += because of strings being
immutable. So I have tried it:

from time import time
t=time()

s='almfklasmfkmaskmkmasfkmkqemkmqeqw'
for x in range(40):
    #s+= s[len(s)/2:]
    s="".join((s,s[len(s)/2:]))

print 'duration', time() - t

I get 1.55016708374 seconds for the concatenation and 3.01116681099
seconds for the join. I also tried moving the join outside the loop,
but it still takes a little over 3 seconds.
I'm using Python 2.4.2, GCC 3.3.3 (SuSE Linux).

So what am I doing wrong?
 
Duncan Booth

Cristian.Codorean said:
I was just reading a "Python Speed/Performance Tips" article on the
Python wiki
http://wiki.python.org/moin/PythonSpeed/PerformanceTips
and I got to the part that talks about string concatenation and that
it is faster when using join instead of += because of strings being
immutable.

.... snip ...
So what am I doing wrong ?

What you are doing wrong is failing to search the newsgroup to see if
anyone else has asked the same question within the last few days. In
particular see the thread titled 'which is better, string concatentation
or substitution?' message <[email protected]>

http://groups.google.co.uk/group/co...b94730bedb/06412fc21aa1fe8c?#06412fc21aa1fe8c
 
Ben Sizer

Cristian.Codorean said:
I was just reading a "Python Speed/Performance Tips" article on the
Python wiki
http://wiki.python.org/moin/PythonSpeed/PerformanceTips
and I got to the part that talks about string concatenation and that it
is faster when using join instead of += because of strings being
immutable.

The idea is that you call join() once rather than calling += many
times. You achieve this by placing all the strings you want to
concatenate into a single list, not by calling join() many times on
separate lists.
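
To make the difference concrete, here is a minimal sketch of the two approaches side by side. The function names and the sample data are made up for illustration and do not come from the wiki article; the code is written to run on both Python 2 and Python 3.

```python
# Sketch only: function names and sample data are illustrative assumptions.

def build_with_plus(pieces):
    """Concatenate with +=: one concatenation per piece."""
    result = ""
    for piece in pieces:
        result += piece
    return result


def build_with_join(pieces):
    """Collect the pieces in a single list, then call join() exactly once."""
    buf = []
    for piece in pieces:
        buf.append(piece)   # cheap: only a reference is stored
    return "".join(buf)     # the final string is built here, in one step


pieces = ["spam", "eggs", "ham"] * 3
# Both approaches must produce the same string; only the cost differs.
assert build_with_plus(pieces) == build_with_join(pieces)
```

Calling "".join((s, extra)) inside the loop, as in the original test, still creates a new string on every iteration, so it cannot be expected to beat +=.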
 
Bruno Desthuilliers

Cristian.Codorean wrote:
I was just reading a "Python Speed/Performance Tips" article on the
Python wiki
http://wiki.python.org/moin/PythonSpeed/PerformanceTips
and I got to the part that talks about string concatenation and that it
is faster when using join instead of += because of strings being
immutable.

This is somewhat obsolete. String concatenation has been subject to
some optimization since 2.4 (IIRC - else please someone correct me).
NB: this optimization is specific to CPython; other implementations
may not have it.

But the "".join() idiom is, well, still idiomatic...
So I have tried it:

from time import time
t=time()

s='almfklasmfkmaskmkmasfkmkqemkmqeqw'
for x in range(40):
    #s+= s[len(s)/2:]
    s="".join((s,s[len(s)/2:]))

Lol...

I'm afraid you didn't get the idiom right. The point is to avoid useless
allocations in the loop body. The idiom is:

buf = []
for x in range(42):
    buf.append(s)    # only collect the pieces inside the loop
s = "".join(buf)     # build the final string once, after the loop
print 'duration', time() - t

timeit may be a better choice for microbenchmarks.
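
As a rough sketch of what a timeit-based comparison could look like: the setup string, the statement names, and the repeat counts below are arbitrary assumptions, not figures from this thread. It uses timeit.Timer so that it should also work on older Python 2 releases, and it deliberately makes no claim about which variant wins on your machine.

```python
import timeit

# Hypothetical workload: 1000 copies of the test string from the original post.
SETUP = "pieces = ['almfklasmfkmaskmkmasfkmkqemkmqeqw'] * 1000"

# Variant 1: repeated += inside a loop.
PLUS_STMT = """
s = ''
for p in pieces:
    s += p
"""

# Variant 2: a single join() over the whole list.
JOIN_STMT = "s = ''.join(pieces)"


def measure(stmt, number=200, repeat=3):
    """Return the best total time in seconds over `repeat` runs,
    each run executing `stmt` `number` times."""
    timer = timeit.Timer(stmt, SETUP)
    return min(timer.repeat(repeat=repeat, number=number))


plus_time = measure(PLUS_STMT)
join_time = measure(JOIN_STMT)
print("+=   : %.6f s" % plus_time)
print("join : %.6f s" % join_time)
```

Taking the minimum of several repeats reduces the influence of other processes on the measurement, which a single time() difference cannot do.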
 
