How much slower is dict indexing vs. list indexing?

E

Emin

Dear Experts,

How much slower is dict indexing vs. list indexing (or indexing into a
numpy array)? I realize that looking up a value in a dict should be
constant time, but does anyone have a sense of what the overhead will
be in doing a dict lookup vs. indexing into a list? Some ad hoc tests
I've done indicate that the overhead is less than 15% (i.e., dict
lookups seem to take no more than 15% longer than indexing into a list
and there doesn't seem to be much difference in indexing into a list
vs. a numpy array).

The reason I ask is because I'm wondering how hard I should work to
compute and store an index into a list to avoid repeated hash table
lookups. From my tests, it looks like the answer is basically "don't
bother". Does anyone have information, thoughts, or comments on this?

Thanks,
-Emin
 
S

Steve Holden

Emin said:
Dear Experts,

How much slower is dict indexing vs. list indexing (or indexing into a
numpy array)? I realize that looking up a value in a dict should be
constant time, but does anyone have a sense of what the overhead will
be in doing a dict lookup vs. indexing into a list? Some ad hoc tests
I've done indicate that the overhead is less than 15% (i.e., dict
lookups seem to take no more than 15% longer than indexing into a list
and there doesn't seem to be much difference in indexing into a list
vs. a numpy array).

The reason I ask is because I'm wondering how hard I should work to
compute and store an index into a list to avoid repeated hash table
lookups. From my tests, it looks like the answer is basically "don't
bother". Does anyone have information, thoughts, or comments on this?
I think "don't bother" is a good conclusion. Tim Peters has led the
charge to (as he might put it) "optimize the snot" out of the dict
implementation. Anything you do in Python to speed up access is likely
to add more to execution time than you might save by not exercising the
C-based dict lookup code.

What technique were you thinking of to look up the cached index values
in Python, just as a matter of curiosity? Storing them in a dict? It
would be hard to think of a faster way ... ;-)

regards
Steve
 
E

Emin

What technique were you thinking of to look up the cached index values
in Python, just as a matter of curiosity? Storing them in a dict? It
would be hard to think of a faster way ... ;-)

I didn't have anything fancy in mind. I was just wondering whether it
makes sense to replace a block of code like

data = {'a' : 1, 'b' :2, 'c' : 3}
for i in someLargeList:
for name in ['a','b','c']:
result.append(data[name])

with somthing like

data = {'a' : 1, 'b' :2, 'c' : 3}
dataValues = [data[k] for k in ['a','b','c']
for i in someLargeList:
for index in [0,1,2]:
result.append(dataValues[index])

It sounds like it's probably not worth the effort in general, but might
be for extremely ultra-critical parts of code.

Thanks
 
S

skip

Emin> It sounds like it's probably not worth the effort in general, but
Emin> might be for extremely ultra-critical parts of code.

Even in extremely ultra-critical parts of code I doubt the loss of
readability would be worth it. If there are situations where you can really
gain by switching from a natural indexing scheme to lists, there are
probably other places in your code where you will gain just as much benefit
without the corresponding loss of readability. Indexing lists only appears
to be about twice as fast as indexing dicts:

% timeit.py -s "data = {'a' : 1, 'b' :2, 'c' : 3}" "for k in 'abc': x = data[k]"
100000 loops, best of 3: 4.61 usec per loop
% timeit.py -s "data = [1, 2, 3]" "for k in [0, 1, 2]: x = data[k]"
100000 loops, best of 3: 2.97 usec per loop

If you're worried about regaining a couple microseconds per index you
probably shouldn't be using Python.

Skip
 
P

Paul McGuire

Emin said:
What technique were you thinking of to look up the cached index values
in Python, just as a matter of curiosity? Storing them in a dict? It
would be hard to think of a faster way ... ;-)

I didn't have anything fancy in mind. I was just wondering whether it
makes sense to replace a block of code like

data = {'a' : 1, 'b' :2, 'c' : 3}
for i in someLargeList:
for name in ['a','b','c']:
result.append(data[name])

with somthing like

data = {'a' : 1, 'b' :2, 'c' : 3}
dataValues = [data[k] for k in ['a','b','c']
for i in someLargeList:
for index in [0,1,2]:
result.append(dataValues[index])

[Your as-posted code doesn't run, you are missing a trailing ']' in your
list comprehension. ]

So what you want is this?
1. Get the values from the data dict in order of their key, ['a','b','c']
(which is not the same as getting the data.values() list, which is in
unpredictable order)
2. For every element in some larger list, append each of the elements in
order from step 1 to some other result list.

First of all, realize that:
for index in [0,1,2]:
result.append(dataValues[index])
is the same as
result.extend(dataValues)
assuming that dataValues has exactly 3 elements in it.

Second, why are you iterating over someLargeList? You are doing nothing
with i, and neither the data dict nor the dataValues list changes as you
iterate.

This will do the job more quickly, I should think:

data = {'a' : 1, 'b' :2, 'c' : 3}
dataValues = [data[k] for k in ['a','b','c']]
result.extend( dataValues * len(someLargeList) )

-- Paul
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,054
Latest member
TrimKetoBoost

Latest Threads

Top