loop beats generator expr creating large dict!?

G

George Young

[Python 2.5c2 (r25c2:51859, Sep 17 2006, 19:57:40), sparc solaris]

I am puzzled that creating large dicts with an explicit iterable of
key,value pairs seems to be slow. I thought to save time by doing:

palettes = dict((w,set(w)) for w in words)

instead of:

palettes={}
for w in words:
palettes[w]=set(w)

where words is a list of 200000 english words. But, in fact,
timeit shows the generator expression takes 3.0 seconds
and the "for" loop 2.1 seconds. Am I missing something?

For good measure, I did an (old fashioned?!) list comprehension:

palettes = dict([(w,set(w)) for w in words])

which took 3.1 seconds. So it seems that, by either list
comprehension or generator expression, creating a large
dictionary all at once is slower than looping to set individual
entries. This seems illogical. Shouldn't dict be able to take
advantage of having all it's entries at once to create itself
faster?

-- George Young
 
B

Ben Cartwright

George said:
I am puzzled that creating large dicts with an explicit iterable of
key,value pairs seems to be slow. I thought to save time by doing:

palettes = dict((w,set(w)) for w in words)

instead of:

palettes={}
for w in words:
palettes[w]=set(w)

where words is a list of 200000 english words. But, in fact,
timeit shows the generator expression takes 3.0 seconds
and the "for" loop 2.1 seconds. Am I missing something?

Creating those 200,000 (w, set(w)) intermediate tuples isn't free. You
aren't doing that in for loop version. If you were:

# Slowest of all!
palettes={}
for w,s in ((w,set(w)) for w in words):
palettes[w]=s

--Ben
 
A

Ant

George Young wrote:
....
I am puzzled that creating large dicts with an explicit iterable of
key,value pairs seems to be slow. I thought to save time by doing:

palettes = dict((w,set(w)) for w in words)

instead of:

palettes={}
for w in words:
palettes[w]=set(w)

In the generator case, you are first creating a tuple, and then
assigning the first element of the tuple as the key and the second
element as the value in the dictionary. In the loop, you are directly
setting the key and value in the dictionary.

Essentially by doing the generator (or comprehension), you are packing
and unpacking the key, value pairs for each word in your initial list,
all of which will have an overhead.
 
D

David Isaac

Does George's example raise the question:
why do dictionaries not implement efficient creation
for two common cases?

- Making a dict from two sequences of the same length.
- Making a dict from a sequence and a function
(as in George's example in this thread).

The current situation is:
use a loop because the obvious generator approach
is not efficient. Something seems wrong here...

Cheers,
Alan Isaac
 
F

Fredrik Lundh

David said:
The current situation is: use a loop because the obvious generator
approach is not efficient.

"not efficient" compared to what ?

</F>
 
S

Steven Bethard

David said:
Does George's example raise the question:
why do dictionaries not implement efficient creation
for two common cases?

- Making a dict from two sequences of the same length.
- Making a dict from a sequence and a function
(as in George's example in this thread).

Maybe you should lobby for dict comprehensions in Python 3000. It
already has set comprehensions:

http://www.python.org/dev/peps/pep-3100/

STeVe
 
S

Steve Holden

Fredrik said:
David Isaac wrote:




"not efficient" compared to what ?
Compared to understanding what's actually going on and working with
reality as opposed to some superstitious view of how the interpreter
operates, perhaps?

Dag nab it, I'm turning into the effbot!

regards
Steve
 
D

David Isaac

Fredrik Lundh said:
"not efficient" compared to what ?

I already guess that I've missed your point, but to prove it...
I was referring to the beginning of this thread
where George noted that
palettes = dict((w,set(w)) for w in words)
runs slower than
palettes={}
for w in words:
palettes[w]=set(w)
The reason seems obvious: the otiose tuple creation,
but there is no "more efficient" syntax to use with 'dict'.

Alan Isaac
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top