Built-in datatypes speed

  • Thread starter: Maël Benjamin Mettler

Maël Benjamin Mettler

Hello Python-List

I hope somebody can help me with this. I spent some time googling for an
answer, but due to the nature of the problem lots of irrelevant stuff
shows up.

Anyway, I reimplemented parts of TigerSearch (
http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERSearch/ ) in Python.
I am currently writing the paper that goes along with this
reimplementation. Part of the paper deals with the
differences/similarities between the original Java implementation and my
reimplementation. In order to superficially evaluate differences in
speed, I used this paper (
http://www.ubka.uni-karlsruhe.de/cgi-bin/psview?document=ira/2000/5&format=1
) as a reference. Now, this is not about speed differences between Java
and Python, mind you, but about the speed built-in datatypes
(dictionaries, lists etc.) run at. As far as I understood it from the
articles and books I read, any method call on these objects runs nearly
at C-speed (I use this due to lack of a better term), since these parts
are implemented in C. Now the question is:

a) Is this true?
b) Is there a correct term for C-speed and what is it?

I would greatly appreciate an answer to that, since this has some impact
on the argumentation in the paper.

Thanks,

Maël

PS: For people interested in this reimplementation project: my code will
be published here (
http://www.ling.su.se/dali/downloads/treealigner/index.htm ) as soon as
it is integrated with the GUI and properly tested. The whole thing is
GPLed...
 

Klaas

Maël Benjamin Mettler wrote:
> Anyway, I reimplemented parts of TigerSearch (http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERSearch/) in Python.
> I am currently writing the paper that goes along with this
> reimplementation. Part of the paper deals with the
> differences/similarities in the original Java implementation and my
> reimplementation. In order to superficially evaluate differences in
> speed, I used this paper (http://www.ubka.uni-karlsruhe.de/cgi-bin/psview?document=ira/2000/5&f...
> ) as a reference. Now, this is not about speed differences between Java
> and Python, mind you, but about the speed built-in datatypes
> (dictionaries, lists etc.) run at. As far as I understood it from the
> articles and books I read, any method call from these objects run nearly
> at C-speed (I use this due to lack of a better term), since these parts
> are implemented in C. Now the question is:
>
> a) Is this true?
> b) Is there a correct term for C-speed and what is it?

I think the statement is highly misleading. It is true that most of
the underlying operations on native data types are implemented in C.
If the operations themselves are expensive, they could run close to
the speed of a suitably generic C implementation of, say, a
hashtable. But with richer data types, you stand a good chance of
landing back in Python-land, e.g. via __hash__, __eq__, etc.
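To make the "landing back in Python-land" point concrete, here is a minimal sketch written for Python 3. The `Key` class and the `count_hook_calls` helper are hypothetical names invented for this illustration. It counts how often a dict calls back into Python-level `__hash__` and `__eq__` while doing work that is otherwise handled in C.

```python
class Key:
    """Hypothetical key type that counts calls to its Python-level hooks."""

    hash_calls = 0
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # Runs as Python bytecode every time the dict needs this key's hash.
        Key.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        # Runs as Python bytecode whenever the dict compares two keys
        # whose hashes match but which are not the same object.
        Key.eq_calls += 1
        return isinstance(other, Key) and self.value == other.value


def count_hook_calls(n):
    """Insert n keys, then look each one up through an equal but distinct key.

    Returns the number of __hash__ and __eq__ calls that were made.
    """
    Key.hash_calls = Key.eq_calls = 0
    table = {Key(i): i for i in range(n)}
    for i in range(n):
        assert table[Key(i)] == i
    return Key.hash_calls, Key.eq_calls


if __name__ == "__main__":
    hashes, compares = count_hook_calls(1000)
    print("__hash__ calls:", hashes)
    print("__eq__ calls:  ", compares)
```

Every insertion and every lookup goes through `__hash__`, and each lookup with a fresh key object also goes through `__eq__`, so the C hashtable code spends much of its time waiting on interpreted Python methods. With plain `int` or `str` keys those hooks are themselves implemented in C and this overhead does not arise.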

Also, method dispatch to C is relatively slow. A loop such as:

    lst = []
    for i in xrange(int(10e6)):  # 10e6 == 10,000,000; xrange is Python 2 (range in Python 3)
        lst.append(i)

will spend most of its time in method dispatch and iterating, and very
little in the "guts" of append().

Those guts, mind, will be quick.
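One way to check this claim on your own machine is a rough timing comparison. The sketch below is written for Python 3 (so `range` replaces `xrange`) and uses a smaller element count than the loop in the post to keep the run short. The function names are made up for this illustration. It compares the plain loop, the same loop with the method looked up once in advance, and a version in which both the iteration and the appending happen inside C.

```python
import timeit

N = 1_000_000  # assumption: smaller than the 10e6 in the post, for a quick run


def append_loop(n=N):
    """Plain loop: attribute lookup and method call on every iteration."""
    lst = []
    for i in range(n):
        lst.append(i)
    return lst


def append_prebound(n=N):
    """Same loop, but the bound method is looked up only once."""
    lst = []
    append = lst.append
    for i in range(n):
        append(i)
    return lst


def build_in_c(n=N):
    """Let list() consume the range directly, with no Python-level loop."""
    return list(range(n))


if __name__ == "__main__":
    for fn in (append_loop, append_prebound, build_in_c):
        # Take the best of three runs to reduce noise from other processes.
        seconds = min(timeit.repeat(fn, number=1, repeat=3))
        print(f"{fn.__name__:16s} {seconds:.3f} s")
```

The absolute numbers depend on the interpreter version and the hardware, and recent CPython releases have reduced the cost of method calls, so the gap between the first two variants may be small. The difference between the loops and `build_in_c` is the part that reflects interpreter overhead rather than the work done by `append()` itself.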

-Mike
 
