How can I speed this function up?


Chris

This is just some dummy code to mimic what's being done in the real
code. The actual code is Python, which is used as a scripting language in
a third-party app. The data structure returned by the app is more or
less like the "data" list in the code below. The test for "ELEMENT" is
necessary ... it just evaluates to true every time in this test code. In
the real app perhaps 90% of the tests will be true.

So my question is: how can I speed up what's happening inside the
function write_data()? I'm only allowed to use vanilla Python (no Psyco or
other libraries outside of a vanilla Python install).

I have a vested interest in showing a colleague that a Python app can
yield results in a time comparable to his C app, which he feels is much
faster. I'd like to know what I can do within the constraints of the
Python language to get the best speed possible. Hope someone can help.

def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            out.write("%s %06d " % (i[0], i[1]))
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")

import timeit

# basic data mimicking data returned from 3rd party app
data = []
for i in range(500000):
    data.append(("ELEMENT", i, (1,2,3,4,5,6)))

# write data out to file
fname = "test2.txt"
out = open(fname,'w')
start = timeit.time.clock()
write_data1(out, data)
out.close()
print timeit.time.clock()-start
 

Terry Reedy

Chris said:
def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':

Testing for equality with 'is' is a bit of a cheat since it is
implementation-dependent, but since you have a somewhat unfair constraint ....

            out.write("%s %06d " % (i[0], i[1]))

Since i[0] is tested to be 'ELEMENT', this should be the same as

            out.write("ELEMENT %06d " % i[1])

which saves constructing a tuple as well as an interpolation.
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")
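
Putting that together (and using == rather than 'is', per the note above), a
minimal untested sketch of the whole function would look something like this;
write_data1b is just a name picked for illustration:

def write_data1b(out, data):
    # fold the constant "ELEMENT" into the format string so no tuple is
    # built for the first write; otherwise identical to write_data1
    for i in data:
        if i[0] == 'ELEMENT':
            out.write("ELEMENT %06d " % i[1])
            for j in i[2]:
                out.write("%d " % j)
            out.write("\n")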

tjr
 

Chris

Chris said:
This is just some dummy code to mimic what's being done in the real
code. The actual code is Python, which is used as a scripting language in
a third-party app. The data structure returned by the app is more or
less like the "data" list in the code below. The test for "ELEMENT" is
necessary ... it just evaluates to true every time in this test code. In
the real app perhaps 90% of the tests will be true.

So my question is: how can I speed up what's happening inside the
function write_data()? I'm only allowed to use vanilla Python (no Psyco or
other libraries outside of a vanilla Python install).

I have a vested interest in showing a colleague that a Python app can
yield results in a time comparable to his C app, which he feels is much
faster. I'd like to know what I can do within the constraints of the
Python language to get the best speed possible. Hope someone can help.

def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            out.write("%s %06d " % (i[0], i[1]))
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")

import timeit

# basic data mimicking data returned from 3rd party app
data = []
for i in range(500000):
    data.append(("ELEMENT", i, (1,2,3,4,5,6)))

# write data out to file
fname = "test2.txt"
out = open(fname,'w')
start = timeit.time.clock()
write_data1(out, data)
out.close()
print timeit.time.clock()-start

With this function I went from 8.04 s to 6.61 s. Now I'm running up against
my limited knowledge of Python. Any chance of getting it faster?

def write_data4(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            strx = "%s %06d " % (i[0], i[1])
            strx = "".join([strx] + ["%d " % (j) for j in i[2]] + ["\n"])
            out.write(strx)
 

Guest

Hi, Chris.
I made a trivial testing framework for this cute problem and tried a
couple of modifications. I also added the 10% of non-ELEMENT lines you
mentioned. First thing: your updated algorithm didn't really give me much
faster results than the original. I guess my disk array sort of hides the
multiple-write penalty. But I experimented with various algorithms.
Here's the code in its entirety:
http://www.rafb.net/paste/results/ZuW4fK85.html
My results (Python 2.4, 32-bit Fedora Core) were:

[ksh@lapoire tmp]# python test.py
Preparing data...
[write_data1] Preparing output file...
[write_data1] Writing...
[write_data1] Done in 10.73 seconds.
[write_data4] Preparing output file...
[write_data4] Writing...
[write_data4] Done in 10.46 seconds.
[write_data_flush] Preparing output file...
[write_data_flush] Writing...
[write_data_flush] Done in 9.09 seconds.
[write_data_per_line] Preparing output file...
[write_data_per_line] Writing...
[write_data_per_line] Done in 9.71 seconds.
[write_data_once] Preparing output file...
[write_data_once] Writing...
[write_data_once] Done in 7.82 seconds.

I'm pretty sure that your measurements will vary (judging by your results,
you seem to have a faster CPU but slower disk(s)), but you can just take
whatever works best for you. I'm also quite confident that you won't be able
to catch up with C, since as you can see Python's data structures are far
more flexible and thus require more processing overhead.
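
Roughly, write_data_once just builds all the lines in memory and hands them
to a single write() call; a sketch of the idea (not necessarily the exact
code from the paste) is:

def write_data_once(out, data):
    # collect every line in a list and hand the whole thing to write() once
    lines = []
    append = lines.append
    for i in data:
        if i[0] == 'ELEMENT':
            append("ELEMENT %06d %s \n" % (i[1], " ".join(map(str, i[2]))))
    out.write("".join(lines))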

Regards,
Łukasz
 

Gabriel Genellina

This is just some dummy code to mimic what's being done in the real
code. The actual code is Python, which is used as a scripting language in
a third-party app. The data structure returned by the app is more or
less like the "data" list in the code below. The test for "ELEMENT" is
necessary ... it just evaluates to true every time in this test code. In
the real app perhaps 90% of the tests will be true.

So my question is: how can I speed up what's happening inside the
function write_data()? I'm only allowed to use vanilla Python (no Psyco or
other libraries outside of a vanilla Python install).

I have a vested interest in showing a colleague that a Python app can
yield results in a time comparable to his C app, which he feels is much
faster. I'd like to know what I can do within the constraints of the
Python language to get the best speed possible. Hope someone can help.

If you can assume that all items have 6 numbers, it appears best to
unroll the inner iteration. Below is my best attempt with ideas from
other replies too, including some alternatives. The timing is only
approximate and had a wide dispersion; median of three. But it's
clear that the main gain comes from calling out.write only once:

Notice that you can't, in general, use i[0] is 'ELEMENT' unless you
can guarantee that i[0] is an interned string (and if it comes from
another process, chances are it isn't). Using intern(i[0]) is
'ELEMENT' would work, but slows down your program.
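
A quick interactive illustration of the point (CPython-specific behaviour,
and the runtime-built string is only an example):

>>> s = "ELEMENT"                  # identifier-like literal, interned by CPython
>>> t = "".join(["ELE", "MENT"])   # built at runtime, not interned
>>> s == t
True
>>> s is t
False
>>> intern(t) is s
True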

# initial: 11.66s
def write_data1(out, data):
    write = out.write
    for i in data:
        if i[0] == 'ELEMENT':  # sorry but can't guarantee identity

            # 6.21s
            write("ELEMENT %06d %s\n" % (i[1], "%d %d %d %d %d %d " % i[2]))

            # 6.92s
            # write("ELEMENT %06d %s \n" % (i[1], " ".join(map(str,i[2]))))

            # 8.30s
            # i2 = i[2]
            # write("ELEMENT %06d %d %d %d %d %d %d \n" % (i[1],
            #     i2[0], i2[1], i2[2], i2[3], i2[4], i2[5]))

            # 7.04s __getitem__
            # i2 = i[2].__getitem__
            # write("ELEMENT %06d %d %d %d %d %d %d \n" % (i[1],
            #     i2(0), i2(1), i2(2), i2(3), i2(4), i2(5)))



--
Gabriel Genellina
Softlab SRL

 

Tim Hochberg

Chris said:
This is just some dummy code to mimic what's being done in the real
code. The actual code is Python, which is used as a scripting language in
a third-party app. The data structure returned by the app is more or
less like the "data" list in the code below. The test for "ELEMENT" is
necessary ... it just evaluates to true every time in this test code. In
the real app perhaps 90% of the tests will be true.

So my question is: how can I speed up what's happening inside the
function write_data()? I'm only allowed to use vanilla Python (no Psyco or
other libraries outside of a vanilla Python install).

Try collecting your output into bigger chunks before writing it out. For
example, take a look at:

def write_data2(out, data):
    buffer = []
    append = buffer.append
    extend = buffer.extend
    for i in data:
        if i[0] == 'ELEMENT':
            append("ELEMENT %06d " % i[1])
            extend(map(str, i[2]))
            append('\n')
    out.write(''.join(buffer))


def write_data3(out, data):
    buffer = []
    append = buffer.append
    for i in data:
        if i[0] == 'ELEMENT':
            append("ELEMENT %06d %s" % (i[1], ' '.join(map(str, i[2]))))
    out.write('\n'.join(buffer))


Both of these run almost twice as fast as the original below (although
admittedly I didn't check that they were actually right). Using some of
the other suggestions mentioned in this thread may make things better
still. It's possible that some intermediate chunk size might be better
than collecting everything into one string, I dunno.
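
For instance, a rough, untested sketch of the intermediate-chunk idea might
look like this (the 10000-line chunk size is an arbitrary placeholder):

def write_data_chunked(out, data, chunk=10000):
    buffer = []
    append = buffer.append
    for i in data:
        if i[0] == 'ELEMENT':
            append("ELEMENT %06d %s\n" % (i[1], ' '.join(map(str, i[2]))))
            if len(buffer) >= chunk:
                out.write(''.join(buffer))
                del buffer[:]          # reuse the same list for the next chunk
    out.write(''.join(buffer))         # flush whatever is left over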

cStringIO might be helpful here as a buffer instead of using lists, but
I don't have time to try it right now.
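
If someone wants to try it, a minimal untested sketch of the cStringIO
variant might be:

from cStringIO import StringIO

def write_data_cstringio(out, data):
    # accumulate everything in an in-memory buffer, then write it in one go
    buf = StringIO()
    write = buf.write
    for i in data:
        if i[0] == 'ELEMENT':
            write("ELEMENT %06d " % i[1])
            write(' '.join(map(str, i[2])))
            write('\n')
    out.write(buf.getvalue())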

-tim

I have a vested interest in showing a colleague that a Python app can
yield results in a time comparable to his C app, which he feels is much
faster. I'd like to know what I can do within the constraints of the
Python language to get the best speed possible. Hope someone can help.

def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            out.write("%s %06d " % (i[0], i[1]))
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")

import timeit

# basic data mimicking data returned from 3rd party app
data = []
for i in range(500000):
    data.append(("ELEMENT", i, (1,2,3,4,5,6)))

# write data out to file
fname = "test2.txt"
out = open(fname,'w')
start = timeit.time.clock()
write_data1(out, data)
out.close()
print timeit.time.clock()-start
 

Paddy

Chris said:
I have a vested interest in showing a colleague that a Python app can
yield results in a time comparable to his C app, which he feels is much
faster. I'd like to know what I can do within the constraints of the
Python language to get the best speed possible. Hope someone can help.

Fight smart!
How long did the C app take to write?
How robust are the C and Python versions w.r.t. unforeseen inputs?
Mimic the software life-cycle:
* How long would it take to make each program work on Windows? On a Mac?
* How long would it take to 'fully' test each program?
How easy is it to explain each program to an audience that has
programmed, but never in C or Python?
How long would it take to add another feature?

"Best" and "best speed" can have many meanings. Good luck.

- Paddy.
 

John Machin

with this function I went from 8.04 s to 6.61 s.

And your code became less understandable.
Now running up against
my limited knowledge of python. Any chance of getting faster?

You have saved 1.4 *seconds*. What is the normal running time for this
app with 0.5M records? What is 1.4 seconds as a percentage of that?

Please consider that you are barking up the wrong gum tree. Competing
with a C app on speed is not something that experienced Python
programmers would take on lightly.

Talk to your colleague about some of these factors: time to write code,
robustness, clarity, ease of maintenance.

Cheers,
John
 

DarkBlue

Just to show how much a system setup
impacts these results:
results from SUSE 10.1 64-bit, Python 2.4,
with an AMD FX-55 CPU and about 12 active apps
running in the background; 7200 rpm SATA drives.

Preparing data...
[write_data1] Preparing output file...
[write_data1] Writing...
[write_data1] Done in 5.43 seconds.
[write_data4] Preparing output file...
[write_data4] Writing...
[write_data4] Done in 4.41 seconds.
[write_data_flush] Preparing output file...
[write_data_flush] Writing...
[write_data_flush] Done in 5.41 seconds.
[write_data_per_line] Preparing output file...
[write_data_per_line] Writing...
[write_data_per_line] Done in 4.4 seconds.
[write_data_once] Preparing output file...
[write_data_once] Writing...
[write_data_once] Done in 4.28 seconds.
 

John Machin

Gabriel said:

We already have a case where the best response to the OP was like
Paddy's response, *not* to answer the question literally.

Then: "loop unrolling"? "assume" with no comments and no assertions?
 

nnorwitz

Chris said:
This is just some dummy code to mimic what's being done in the real
code. The actual code is Python, which is used as a scripting language in
a third-party app. The data structure returned by the app is more or
less like the "data" list in the code below. The test for "ELEMENT" is
necessary ... it just evaluates to true every time in this test code. In
the real app perhaps 90% of the tests will be true.

As others have said, without info about what's happening in C, there's
no way to know what's equivalent or fast enough.
So my question is: how can I speed up what's happening inside the
function write_data()? I'm only allowed to use vanilla Python (no Psyco or
other libraries outside of a vanilla Python install).

Generally, don't create objects, don't perform repeated operations. In
this case, batch up I/O.
def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            out.write("%s %06d " % (i[0], i[1]))
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")

def write_data1(out, data, map=map, str=str):
    SPACE_JOIN = ' '.join
    lines = [("ELEMENT %06d " % i1) + SPACE_JOIN(map(str, i2))
             for i0, i1, i2 in data if i0 == 'ELEMENT']
    out.write('\n'.join(lines))

While perhaps a bit obfuscated, it's a bit faster than the original.
Part of what makes this hard to read is the crappy variable names. I
didn't know what to call them. This version assumes that data will
always be a sequence of 3-element items.

The original version took about 11.5 seconds, the version above takes
just over 5 seconds.

YMMV,
n
 

Fredrik Lundh

Generally, don't create objects, don't perform repeated operations. In
this case, batch up I/O.
def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            out.write("%s %06d " % (i[0], i[1]))
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")

def write_data1(out, data, map=map, str=str):
    SPACE_JOIN = ' '.join
    lines = [("ELEMENT %06d " % i1) + SPACE_JOIN(map(str, i2))
             for i0, i1, i2 in data if i0 == 'ELEMENT']
    out.write('\n'.join(lines))

While perhaps a bit obfuscated, it's a bit faster than the original.
Part of what makes this hard to read is the crappy variable names. I
didn't know what to call them. This version assumes that data will
always be a sequence of 3-element items.

The original version took about 11.5 seconds, the version above takes
just over 5 seconds.

footnote: your version doesn't print the final "\n". here's a variant
that does, and leaves the batching to the I/O subsystem:

def write_data3(out, data, map=map, str=str):
    SPACE_JOIN = ' '.join
    out.writelines(
        "ELEMENT %06d %s\n" % (i1, SPACE_JOIN(map(str, i2)))
        for i0, i1, i2 in data if i0 == 'ELEMENT'
    )

this runs exactly as fast as your example on my machine, but uses less
memory. and if you, for benchmarking purposes, pass in a "sink" file
object that ignores the data you pass it, it runs in no time at all ;-)
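
(a "sink" can be as simple as the class below; the name is arbitrary:)

class Sink(object):
    # file-like object that accepts writes and discards everything
    def write(self, s):
        pass
    def writelines(self, lines):
        for line in lines:   # still consume the generator, just don't keep it
            pass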

</F>
 

Peter Otten

Chris said:
So my question is: how can I speed up what's happening inside the
function write_data()? I'm only allowed to use vanilla Python (no Psyco or
other libraries outside of a vanilla Python install).

def write_data1(out, data):
    for i in data:
        if i[0] is 'ELEMENT':
            out.write("%s %06d " % (i[0], i[1]))
            for j in i[2]:
                out.write("%d " % (j))
            out.write("\n")

# reference, modified to avoid trailing ' '
def write_data(out, data):
    for i in data:
        if i[0] == 'ELEMENT':
            out.write("%s %06d" % (i[0], i[1]))
            for j in i[2]:
                out.write(" %d" % j)
            out.write("\n")

# Norwitz/Lundh
def writelines_data(out, data, map=map, str=str):
    SPACE_JOIN = ' '.join
    out.writelines(
        "ELEMENT %06d %s\n" % (i1, SPACE_JOIN(map(str, i2)))
        for i0, i1, i2 in data if i0 == 'ELEMENT'
    )

def print_data(out, data):
    for name, index, items in data:
        if name == "ELEMENT":
            print >> out, "ELEMENT %06d" % index,
            for item in items:
                print >> out, item,
            print >> out


import time

data = []
for i in range(500000):
    data.append(("ELEMENT", i, (1,2,3,4,5,6)))

for index, write in enumerate([write_data, writelines_data, print_data]):
    fname = "test%s.txt" % index
    out = open(fname, 'w')
    start = time.time()
    write(out, data)
    out.close()
    print write.__name__, time.time()-start

for fname in "test1.txt", "test2.txt":
    assert open(fname).read() == open("test0.txt").read(), fname

Output on my machine:

$ python2.5 writedata.py
write_data 10.3382301331
writelines_data 5.4960360527
print_data 3.50765490532

Moral: don't forget about good old print. It does have an opcode(*) of its
own, after all.

Peter

(*) or two
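
(You can inspect them with dis; for example, the listing of a small
print-based function like the one below includes PRINT_ITEM_TO and
PRINT_NEWLINE_TO:)

import dis

def demo(out, index, items):
    print >> out, "ELEMENT %06d" % index,
    for item in items:
        print >> out, item,
    print >> out

dis.dis(demo)   # look for PRINT_ITEM_TO / PRINT_NEWLINE_TO in the listing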
 

nnorwitz

Peter said:
# Norwitz/Lundh
def writelines_data(out, data, map=map, str=str):
    SPACE_JOIN = ' '.join
    out.writelines(
        "ELEMENT %06d %s\n" % (i1, SPACE_JOIN(map(str, i2)))
        for i0, i1, i2 in data if i0 == 'ELEMENT'
    )

def print_data(out, data):
    for name, index, items in data:
        if name == "ELEMENT":
            print >> out, "ELEMENT %06d" % index,
            for item in items:
                print >> out, item,
            print >> out

Output on my machine:

$ python2.5 writedata.py
write_data 10.3382301331
writelines_data 5.4960360527
print_data 3.50765490532

Interesting. I timed with python2.4 and get this:

write_data 12.3158090115
writelines_data 5.02135300636
print_data 5.01881980896

A second run yielded:

write_data 11.5980260372
writelines_data 4.8575668335
print_data 4.84622001648

I'm a bit surprised by your numbers, because I would expect string ops
to be faster in 2.5 than in 2.4 thanks to /F. I don't remember other
changes that would cause such an improvement for print between 2.4 and
2.5. (2.3 shows print doing a bit better than the times above.)

It could be that the variability is high due to lots of I/O or even
different builds. I'm on Linux.
Moral: don't forget about good old print. It does have an opcode(*) of its
own, after all.

Using print really should be faster, as fewer objects are created.
(*) or two

or 5 :)

$ grep 'case PRINT_' Python/ceval.c
case PRINT_EXPR:
case PRINT_ITEM_TO:
case PRINT_ITEM:
case PRINT_NEWLINE_TO:
case PRINT_NEWLINE:

n
 
