How much slower is dict indexing vs. list indexing?

Discussion in 'Python' started by Emin, Jan 11, 2007.

  1. Emin

    Emin Guest

    Dear Experts,

    How much slower is dict indexing vs. list indexing (or indexing into a
    numpy array)? I realize that looking up a value in a dict should be
    constant time, but does anyone have a sense of what the overhead will
    be in doing a dict lookup vs. indexing into a list? Some ad hoc tests
    I've done indicate that the overhead is less than 15% (i.e., dict
    lookups seem to take no more than 15% longer than indexing into a list
    and there doesn't seem to be much difference in indexing into a list
    vs. a numpy array).

    The reason I ask is because I'm wondering how hard I should work to
    compute and store an index into a list to avoid repeated hash table
    lookups. From my tests, it looks like the answer is basically "don't
    bother". Does anyone have information, thoughts, or comments on this?

    Thanks,
    -Emin
     
    Emin, Jan 11, 2007
    #1
    1. Advertising

  2. Emin

    Steve Holden Guest

    Emin wrote:
    > Dear Experts,
    >
    > How much slower is dict indexing vs. list indexing (or indexing into a
    > numpy array)? I realize that looking up a value in a dict should be
    > constant time, but does anyone have a sense of what the overhead will
    > be in doing a dict lookup vs. indexing into a list? Some ad hoc tests
    > I've done indicate that the overhead is less than 15% (i.e., dict
    > lookups seem to take no more than 15% longer than indexing into a list
    > and there doesn't seem to be much difference in indexing into a list
    > vs. a numpy array).
    >
    > The reason I ask is because I'm wondering how hard I should work to
    > compute and store an index into a list to avoid repeated hash table
    > lookups. From my tests, it looks like the answer is basically "don't
    > bother". Does anyone have information, thoughts, or comments on this?
    >

    I think "don't bother" is a good conclusion. Tim Peters has led the
    charge to (as he might put it) "optimize the snot" out of the dict
    implementation. Anything you do in Python to speed up access is likely
    to add more to execution time than you might save by not exercising the
    C-based dict lookup code.

    What technique were you thinking of to look up the cached index values
    in Python, just as a matter of curiosity? Storing them in a dict? It
    would be hard to think of a faster way ... ;-)

    regards
    Steve
    --
    Steve Holden +44 150 684 7255 +1 800 494 3119
    Holden Web LLC/Ltd http://www.holdenweb.com
    Skype: holdenweb http://del.icio.us/steve.holden
    Blog of Note: http://holdenweb.blogspot.com
     
    Steve Holden, Jan 11, 2007
    #2
    1. Advertising

  3. Emin

    Emin Guest

    On Jan 11, 5:53 pm, Steve Holden <> wrote:
    >
    > What technique were you thinking of to look up the cached index values
    > in Python, just as a matter of curiosity? Storing them in a dict? It
    > would be hard to think of a faster way ... ;-)


    I didn't have anything fancy in mind. I was just wondering whether it
    makes sense to replace a block of code like

    data = {'a' : 1, 'b' :2, 'c' : 3}
    for i in someLargeList:
    for name in ['a','b','c']:
    result.append(data[name])

    with somthing like

    data = {'a' : 1, 'b' :2, 'c' : 3}
    dataValues = [data[k] for k in ['a','b','c']
    for i in someLargeList:
    for index in [0,1,2]:
    result.append(dataValues[index])

    It sounds like it's probably not worth the effort in general, but might
    be for extremely ultra-critical parts of code.

    Thanks
     
    Emin, Jan 12, 2007
    #3
  4. Emin

    Guest

    Emin> It sounds like it's probably not worth the effort in general, but
    Emin> might be for extremely ultra-critical parts of code.

    Even in extremely ultra-critical parts of code I doubt the loss of
    readability would be worth it. If there are situations where you can really
    gain by switching from a natural indexing scheme to lists, there are
    probably other places in your code where you will gain just as much benefit
    without the corresponding loss of readability. Indexing lists only appears
    to be about twice as fast as indexing dicts:

    % timeit.py -s "data = {'a' : 1, 'b' :2, 'c' : 3}" "for k in 'abc': x = data[k]"
    100000 loops, best of 3: 4.61 usec per loop
    % timeit.py -s "data = [1, 2, 3]" "for k in [0, 1, 2]: x = data[k]"
    100000 loops, best of 3: 2.97 usec per loop

    If you're worried about regaining a couple microseconds per index you
    probably shouldn't be using Python.

    Skip
     
    , Jan 12, 2007
    #4
  5. Emin

    Paul McGuire Guest

    "Emin" <> wrote in message
    news:...
    > On Jan 11, 5:53 pm, Steve Holden <> wrote:
    >>
    >> What technique were you thinking of to look up the cached index values
    >> in Python, just as a matter of curiosity? Storing them in a dict? It
    >> would be hard to think of a faster way ... ;-)

    >
    > I didn't have anything fancy in mind. I was just wondering whether it
    > makes sense to replace a block of code like
    >
    > data = {'a' : 1, 'b' :2, 'c' : 3}
    > for i in someLargeList:
    > for name in ['a','b','c']:
    > result.append(data[name])
    >
    > with somthing like
    >
    > data = {'a' : 1, 'b' :2, 'c' : 3}
    > dataValues = [data[k] for k in ['a','b','c']
    > for i in someLargeList:
    > for index in [0,1,2]:
    > result.append(dataValues[index])
    >


    [Your as-posted code doesn't run, you are missing a trailing ']' in your
    list comprehension. ]

    So what you want is this?
    1. Get the values from the data dict in order of their key, ['a','b','c']
    (which is not the same as getting the data.values() list, which is in
    unpredictable order)
    2. For every element in some larger list, append each of the elements in
    order from step 1 to some other result list.

    First of all, realize that:
    > for index in [0,1,2]:
    > result.append(dataValues[index])

    is the same as
    result.extend(dataValues)
    assuming that dataValues has exactly 3 elements in it.

    Second, why are you iterating over someLargeList? You are doing nothing
    with i, and neither the data dict nor the dataValues list changes as you
    iterate.

    This will do the job more quickly, I should think:

    data = {'a' : 1, 'b' :2, 'c' : 3}
    dataValues = [data[k] for k in ['a','b','c']]
    result.extend( dataValues * len(someLargeList) )

    -- Paul
     
    Paul McGuire, Jan 12, 2007
    #5
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. =?Utf-8?B?ZGhucml2ZXJzaWRl?=

    out of process session state - how much slower?

    =?Utf-8?B?ZGhucml2ZXJzaWRl?=, Oct 28, 2005, in forum: ASP .Net
    Replies:
    4
    Views:
    405
    JIMCO Software
    Oct 29, 2005
  2. Andre Charbonneau

    XPath queries getting slower and slower...

    Andre Charbonneau, Feb 15, 2005, in forum: Java
    Replies:
    0
    Views:
    557
    Andre Charbonneau
    Feb 15, 2005
  3. Mark
    Replies:
    20
    Views:
    1,677
    Dave O'Hearn
    Dec 27, 2004
  4. Replies:
    27
    Views:
    545
    Gabriel Genellina
    Jun 14, 2007
  5. dustbort
    Replies:
    3
    Views:
    4,014
    JosipJaic
    Jun 18, 2010
Loading...

Share This Page