int() 24 times slower than long() in Python 2.3

Willem

When I run the following code using Python 2.3:

from time import clock
t1 = clock ()
for i in range (10000): a = int ('bbbbaaaa', 16)
t2 = clock ()
for i in range (10000): a = long ('bbbbaaaa', 16)
t3 = clock ()
print (t2-t1) / (t3-t2)

it prints out: 23.9673206147

That seems to mean that the int-conversion is 24 times slower than the
long conversion.
The reason is that in Python 2.3 the int-conversion generates
warning messages that you *never* see but that consume a *lot* of CPU.

So old code that performed well under Python 2.2 may suddenly slow down
considerably under Python 2.3 without any perceptible change in the
output ...

Willem Vree

BTW.
The message that you don't see is: OverflowWarning: string/unicode conversion
and you can make it appear by starting the program with
python -Wall script.py
or by including the following lines in the source text:
import warnings
warnings.resetwarnings()
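
For readers on a current Python, where OverflowWarning no longer exists, the effect described above can be approximated with a minimal sketch. HiddenWarning and convert() are hypothetical stand-ins invented for this illustration; they are not part of Python 2.3 or of this thread's code. The sketch shows that an ignored warning leaves no trace, while the call to warnings.warn still runs on every conversion, and it times that overhead without claiming any particular ratio.

```python
import timeit
import warnings


class HiddenWarning(Warning):
    """Hypothetical stand-in for the OverflowWarning of Python 2.3."""


def convert(text):
    # Assumption: model the conversion as "warn, then convert".
    warnings.warn("string/unicode conversion", HiddenWarning)
    return int(text, 16)


# With an "ignore" filter the warning is dropped silently.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore", HiddenWarning)
    value = convert("bbbbaaaa")
    ignored_count = len(caught)

# With an "always" filter the same call is reported.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", HiddenWarning)
    convert("bbbbaaaa")
    shown_count = len(caught)

print(value, ignored_count, shown_count)

# Timing: the ignored warning still costs a filter lookup per call.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", HiddenWarning)
    with_warning = timeit.timeit(lambda: convert("bbbbaaaa"), number=10000)
plain = timeit.timeit(lambda: int("bbbbaaaa", 16), number=10000)
print("slowdown factor:", with_warning / plain)
```

The measured factor depends on the interpreter version and machine, so it should not be compared directly with the figure reported for Python 2.3.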

I suppose that Python is getting over-engineered by too professional
programmers. The first signs of decay?
 
Peter Otten

Willem said:
I suppose that Python is getting over-engineered by too professional
programmers. The first signs of decay?

No. I think warnings.py needs some _more_ engineering - throw in a dictionary
to speed up repetitive calls or so. In the meantime:

import warnings
def dummy(*args, **kw):
    pass
warnings.warn = dummy

Now watch it speed up :)

Peter
 
James Henderson

Willem wrote:
So it may happen that old code performing well under Python 2.2 suddenly
slows down a considerable amount under Python 2.3 without any perceptible
change in the output ...

I don't think this code worked perfectly well under Python 2.2 (see below),
or am I missing something?

Python 2.2.2 (#1, Feb 24 2003, 19:13:11)
[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.....
Traceback (most recent call last):
 
Istvan Albert

Willem wrote:

So it may happen that old code performing well under Python 2.2 suddenly
slows down a considerable amount under Python 2.3 without any perceptible
change in the output ...

IMO the speed at which a bunch of invalid conversions are
executed means nothing at all. Could you come up with an example
that shows the same symptoms in a meaningful context?

i.
 
Nick Smallbone

Istvan Albert said:
IMO the speed at which a bunch of invalid conversions are
executed means nothing at all. Could you come up with an example
that shows the same symptoms in a meaningful context?

What do you mean? bbbbaaaa is a hex number.
3149638314L
 
Paul Rubin

Nick Smallbone said:
What do you mean? bbbbaaaa is a hex number.

3149638314L

It's not an int. It has to attempt to convert to int, trap the error
and recover from it, and then convert to a long.
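
The sequence described here (try the machine int, recover from the overflow, then convert as a long) can be modelled roughly in current Python. This is only a sketch: the 32-bit limits, parse_machine_int() and int_like_2_3() are assumptions made for illustration, not the interpreter's actual C code.

```python
# Assumption: a 32-bit C long, as on the machines discussed in this thread.
C_LONG_MIN, C_LONG_MAX = -2**31, 2**31 - 1


def parse_machine_int(text, base):
    """Hypothetical helper: parse text, failing if it exceeds a C long."""
    value = int(text, base)
    if not C_LONG_MIN <= value <= C_LONG_MAX:
        raise OverflowError("value does not fit in a C long")
    return value


def int_like_2_3(text, base=10):
    """Return (value, type_name) roughly as int() behaved in Python 2.3."""
    try:
        return parse_machine_int(text, base), "int"
    except OverflowError:
        # Python 2.3 issued the hidden OverflowWarning at this point,
        # then redid the conversion as a long.
        return int(text, base), "long"


print(int_like_2_3("7fffffff", 16))
print(int_like_2_3("bbbbaaaa", 16))
```

In this model the extra cost for 'bbbbaaaa' comes from the failed first attempt plus the warning machinery, which is the slow path the original timing loop hits on every iteration.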
 
paolo veronelli

Paul Rubin said:
It's not an int. It has to attempt to convert to int, trap the error
and recover from it, and then convert to a long.

I don't understand the meaning of int().
If I ask for an int (two bytes?), I want two bytes: it should truncate. If the
result is the same as with long() apart from the warning,
I think int() is meaningless. Warnings work against transparency.
Here Python is telling me I used the wrong function, and there are at least
two possible outcomes:
I get wrong results.
Python does my business for me.

IMHO I want wrong results.
 
James Henderson

paolo veronelli said:
I don't understand the meaning of int().
If I ask for an int (two bytes?), I want two bytes: it should truncate. If the
result is the same as with long() apart from the warning,
I think int() is meaningless. Warnings work against transparency.
Here Python is telling me I used the wrong function, and there are at least
two possible outcomes:
I get wrong results.
Python does my business for me.

IMHO I want wrong results.

I think you should look at PEP 237 "Unifying Long Integers and Integers" - it
may even reassure you. As Peter Hansen has already hinted in reply to
another of your messages, Python is in the process of unifying longs and
ints.

It used to be that calling int() on a big number raised an exception - perhaps a
more Pythonic version of the wrong result you asked for - and now
that it doesn't, I suppose there is a sense in which the current distinction
between ints and longs is "unmeaningful" from the user's point of view (it's
not meaningless under the hood of course). According to the PEP, the reason for
keeping two types was that it:

is far easier to implement, is backwards
compatible at the C API level, and in addition can be implemented
partially as a transitional measure.

Perhaps you think that ints and longs should not be unified, but I won't start
arguing till you come out and say it. :)

James
--
James Henderson, Logical Progression Ltd
http://www.logicalprogression.net
http://mailmanager.sourceforge.net
 
paolo veronelli

James Henderson said:
I think you should look at PEP 237 "Unifying Long Integers and Integers" - it
may even reassure you. As Peter Hansen has already hinted in reply to
another of your messages, Python is in the process of unifying longs and
ints.

I don't want to change the basis of PEP 237 in any way, but this kind of
performance flaw is a problem, and it may be the first of a long list.
It's good to always know why we pay a price. Better an error than a ghost
warning, which is itself the cause of the flaw.
Python is fast; this is why I use it. Everyone has a different goal in using
it.

Regards, Paolino
 
James Henderson

paolo veronelli said:
I don't want to change the basis of PEP 237 in any way, but this kind of
performance flaw is a problem, and it may be the first of a long list.
It's good to always know why we pay a price. Better an error than a ghost
warning, which is itself the cause of the flaw.
Python is fast; this is why I use it. Everyone has a different goal in using
it.

Regards, Paolino

I actually agree it's a shame that even ignored warnings have such a
performance impact but at least these warnings are scheduled to be removed
sometime in Python 2.4 (they're still there in 2.4a1 though). Apparently
they were added for "those concerned about extreme backwards compatibility".

James
 
Willem

James Henderson said:
I actually agree it's a shame that even ignored warnings have such a
performance impact but at least these warnings are scheduled to be removed
sometime in Python 2.4 (they're still there in 2.4a1 though). Apparently
they were added for "those concerned about extreme backwards compatibility".

James

That was indeed the reason for my original posting: as a casual
programmer I was disturbed that upgrading to Python 2.3 actually
slowed down my program significantly and that I had to spend so much
time to find the reason: somebody has decided to hide these particular
overflow warnings ...
Why issue a warning and then suppress it by default?
I understand the problem will disappear in the next version.
I apologize for my remark on "over-engineering". I posted the message
when I was still angry, having spent so much time finding the "bug".

Willem
 
James Henderson

Willem said:
That was indeed the reason of my original posting: As a casual
programmer I was disturbed that upgrading to Python 2.3 actually
slowed down my program significantly

As I said before, I don't think the particular code you posted would have
worked at all before 2.3.

Willem said:
and that I had to spend so much
time to find the reason: somebody has decided to hide these particular
overflow warnings ...
Why warning and then suppressing it by default?

The warnings are there because some people may want to know about when the
type conversion is happening, since it's a new feature.

They're suppressed by default because most people won't want to know. :)

J.
 
Willem

James Henderson said:
As I said before, I don't think the particular code you posted would have
worked at all before 2.3.

No, indeed. I tried to make a small program that demonstrates the
problem.
But since you are interested, I have made another more accurate
demonstration of my problem.

Willem

--------------------demonstration---------------------------
# we want to read lots of 32-bit CRC values from a text file
# the numbers are hexadecimal ASCII strings and have to be
# converted to 32-bit signed integers (which happens to be
# the output of zlib.crc32)
import warnings
from time import clock

# be compatible with Python 2.2 for integer overflow
warnings.filterwarnings ('error', category=OverflowWarning)

# my original 8char-hex to 32bit-int conversion in Python 2.2
# it runs much faster for positive numbers than for negative ones
# but is on average faster than an if-based version
a = -158933416 # an example number, make it positive to see quite
               # different results
b = '%08x' % a # the hex ascii string
print a, b
t1 = clock()
for i in range (10000):
    try: c = int (b, 16)
    except: c = - int (0x100000000 - long (b, 16))
t2 = clock()
print t2 - t1, c, type (c)

# straightforward adaptation to Python 2.3: replace try by if.
# first we set warnings to the default
# (this version does not work in Python 2.2)
warnings.filterwarnings ('ignore', category=OverflowWarning)
t1 = clock()
for i in range (10000):
    c = int (b, 16)
    if c > 0x7fffffff: c = - int (0x100000000 - c)
t2 = clock()
print t2 - t1, c, type (c)

# this adaptation runs 10 times faster than the previous version
# in Python 2.3, but is on average slower than the first version
# in Python 2.2. This is certainly not an obvious adaptation.
# Nobody would expect a long conversion to be, on average, faster
# than an int conversion for 32 bit numbers ...
t1 = clock()
for i in range (10000):
    c = long (b, 16)
    if c > 0x7fffffff: c = - int (0x100000000 - c)
    else: c = int(c)
t2 = clock()
print t2 - t1, c, type (c)
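
For comparison, here is a minimal sketch of the same hex-to-signed-32-bit conversion in a current Python, where int and long are a single type and no overflow warning is involved. The function names hex_to_signed32() and signed32_to_hex() are hypothetical, chosen only for this illustration, and the code is not a drop-in replacement for the Python 2.2/2.3 versions above.

```python
def hex_to_signed32(text):
    """Convert an 8-character hex string to a signed 32-bit value."""
    value = int(text, 16)
    # Values above 0x7fffffff represent negative numbers in two's complement.
    if value > 0x7FFFFFFF:
        value -= 0x100000000
    return value


def signed32_to_hex(value):
    """Format a signed 32-bit value as 8 hex digits (two's complement)."""
    return '%08x' % (value & 0xFFFFFFFF)


a = -158933416  # the example number used in the demonstration
b = signed32_to_hex(a)
print(a, b, hex_to_signed32(b))
```

Masking with 0xFFFFFFFF before formatting keeps the hex string at 8 digits for negative inputs, which is the form the demonstration expects to read back.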
 
