Built-in datatypes speed

Discussion in 'Python' started by Maël Benjamin Mettler, Feb 7, 2007.

  1. Hello Python-List

    I hope somebody can help me with this. I spent some time googling for an
    answer, but due to the nature of the problem lots of irrelevant stuff
    shows up.

    Anyway, I reimplemented parts of TigerSearch (
    http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERSearch/ ) in Python.
    I am currently writing the paper that goes along with this
    reimplementation. Part of the paper deals with the
    differences/similarities between the original Java implementation and my
    reimplementation. In order to superficially evaluate differences in
    speed, I used this paper (
    http://www.ubka.uni-karlsruhe.de/cgi-bin/psview?document=ira/2000/5&format=1
    ) as a reference. Now, this is not about speed differences between Java
    and Python, mind you, but about the speed built-in datatypes
    (dictionaries, lists etc.) run at. As far as I understood it from the
    articles and books I read, any method call on these objects runs nearly
    at C-speed (I use this for lack of a better term), since these parts
    are implemented in C. Now the question is:

    a) Is this true?
    b) Is there a correct term for C-speed and what is it?

    I would greatly appreciate an answer to that, since this has some impact
    on the argumentation in the paper.

    Thanks,

    Maël

    PS: For people interested in this reimplementation project: my code will
    be published here (
    http://www.ling.su.se/dali/downloads/treealigner/index.htm ) as soon as
    it is integrated with the GUI and properly tested. The whole thing is
    GPLed...
     
    Maël Benjamin Mettler, Feb 7, 2007
    #1

  2. Maël Benjamin Mettler

    Klaas Guest

    I think the statement is highly misleading. It is true that most of
    the underlying operations on native data types are implemented in C.
    If the operations themselves are expensive, they could run close to
    the speed of a suitably generic C implementation of, say, a
    hashtable. But with richer data types, you run a good chance of
    landing back in Python-land, e.g. via __hash__, __eq__, etc.
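
    To make the "landing back in Python-land" point concrete, here is a
    minimal sketch. The Key class and its counters are hypothetical,
    made up for illustration, and the code is written for Python 3. It
    only shows that a dict operation on such a key calls back into
    Python-level methods; it does not measure how much that costs.

```python
class Key:
    """Hypothetical key type whose hashing and equality are written in Python."""

    hash_calls = 0  # how often the dict called our __hash__
    eq_calls = 0    # how often the dict called our __eq__

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        Key.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.value == other.value


table = {}
table[Key(1)] = "one"   # storing the key needs its hash
found = table[Key(1)]   # an equal but distinct key: hash it, then compare

print(found, Key.hash_calls, Key.eq_calls)
```

    With plain int or str keys the same two operations never leave the
    C implementation, which is the case the "C-speed" claim fits best.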

    Also, method dispatch to C is relatively slow. A loop such as:

    lst = []
    # xrange is Python 2; on Python 3 use range instead
    for i in xrange(int(10e6)):
        lst.append(i)

    will spend most of its time in method dispatch and iterating, and very
    little in the "guts" of append().

    Those guts, mind, will be quick.
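
    If you want numbers for the paper, a rough way to check this is to
    time the same list being built three ways. This is only a sketch
    for Python 3 (range instead of xrange); the function names are
    invented for illustration, and the timings depend on the
    interpreter version and machine, so measure rather than assume.

```python
import timeit


def append_loop(n):
    """Look up and call lst.append on every iteration."""
    lst = []
    for i in range(n):
        lst.append(i)
    return lst


def append_loop_cached(n):
    """Look the bound method up once, then only pay for the call."""
    lst = []
    append = lst.append
    for i in range(n):
        append(i)
    return lst


def build_in_c(n):
    """Let list() consume the range; the loop itself runs in C."""
    return list(range(n))


if __name__ == "__main__":
    n = 1_000_000
    for fn in (append_loop, append_loop_cached, build_in_c):
        seconds = timeit.timeit(lambda: fn(n), number=5)
        print(f"{fn.__name__:20s} {seconds:.3f} s")
```

    All three return the same list, so any gap between them comes from
    the looping and dispatch around append(), not from append() itself.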

    -Mike
     
    Klaas, Feb 9, 2007
    #2
