Performance of int/long in Python 3

Chris Angelico · Mar 25, 2013

The Python 3 merge of int and long has effectively penalized
small-number arithmetic by removing an optimization. As we've seen
from PEP 393 strings (jmf aside), there can be huge benefits from
having a single type with multiple representations internally. Is
there value in making the int type have a machine-word optimization in
the same way?

The cost is clear. Compare these methods for calculating the sum of
all numbers up to 65535, which stays under 2^31:

def range_sum(n):
    # Let the built-in sum() run the loop.
    return sum(range(n+1))

def forloop(n):
    # The running total stays below 2^31 throughout.
    tot=0
    for i in range(n+1):
        tot+=i
    return tot

def forloop_offset(n):
    # Same loop, but the total starts far above 2^31.
    tot=1000000000000000
    for i in range(n+1):
        tot+=i
    return tot-1000000000000000

import timeit
import sys
print(sys.version)
print("inline: %d"%sum(range(65536)))
print(timeit.timeit("sum(range(65536))",number=1000))
for func in ['range_sum','forloop','forloop_offset']:
    print("%s: %r"%(func,(globals()[func](65535))))
    print(timeit.timeit(func+"(65535)","from __main__ import "+func,number=1000))

Windows XP:
C:\>python26\python inttime.py
2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)]
inline: 2147450880
2.36770455463
range_sum: 2147450880
2.61778550067
forloop: 2147450880
7.91409131608
forloop_offset: 2147450880L
23.3116954809

C:\>python33\python inttime.py
3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600 32 bit (Intel)]
inline: 2147450880
5.25038713020789
range_sum: 2147450880
5.412975112758745
forloop: 2147450880
17.875799577879313
forloop_offset: 2147450880
19.31672544974291

Debian Wheezy:
rosuav@sikorsky:~$ python inttime.py
2.7.3 (default, Jan 2 2013, 13:56:14)
[GCC 4.7.2]
inline: 2147450880
1.92763710022
range_sum: 2147450880
1.93409109116
forloop: 2147450880
5.14633893967
forloop_offset: 2147450880
5.13459300995
rosuav@sikorsky:~$ python3 inttime.py
3.2.3 (default, Feb 20 2013, 14:44:27)
[GCC 4.7.2]
inline: 2147450880
2.884124994277954
range_sum: 2147450880
2.6586129665374756
forloop: 2147450880
7.660192012786865
forloop_offset: 2147450880
8.11817193031311

On 2.6/2.7, there's a massive penalty for switching to longs; on
3.2/3.3, the two for-loop versions are nearly identical in time.
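
To put rough numbers on that, here is a small sketch that divides each forloop_offset time by the matching forloop time. The timings are copied from the runs above and rounded to four places; the dictionary layout and the name offset_penalty are just for illustration.

```python
# Timings in seconds, copied from the runs above and rounded:
# (forloop, forloop_offset) for each interpreter.
timings = {
    "2.6.5 Windows XP": (7.9141, 23.3117),
    "3.3.0 Windows XP": (17.8758, 19.3167),
    "2.7.3 Debian Wheezy": (5.1463, 5.1346),
    "3.2.3 Debian Wheezy": (7.6602, 8.1182),
}

def offset_penalty(forloop, forloop_offset):
    """How many times slower the large-number loop is than the small one."""
    return forloop_offset / forloop

penalties = {label: offset_penalty(*pair) for label, pair in timings.items()}
for label, ratio in penalties.items():
    print("%-20s %.2f" % (label, ratio))
```

A ratio near 1 means the interpreter treats both loops alike; a ratio well above 1 means the larger starting value costs extra.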

(Side point: I'm often seeing that 3.2 on Linux is marginally faster
calling my range_sum function than doing the same thing inline. I do
not understand this. If anyone can explain what's going on there, I'm
all ears!)

Python 3's int is faster than Python 2's long, but slower than Python
2's int. So the question really is, would a two-form representation be
beneficial, and if so, is it worth the coding trouble?
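
For anyone wondering what a two-form representation might look like, here is a toy model in pure Python. It is only a sketch: TwoFormInt, WORD_MIN and WORD_MAX are made-up names, the 32-bit word size is an assumption, and this is not how CPython implements int.

```python
# Assumed word size for the sketch: signed 32-bit.
WORD_MIN = -2**31
WORD_MAX = 2**31 - 1

class TwoFormInt:
    """Toy integer that records which internal form it would use."""

    def __init__(self, value):
        self.value = value
        # "word" stands in for a machine-word fast path,
        # "bignum" for the arbitrary-precision fallback.
        if WORD_MIN <= value <= WORD_MAX:
            self.form = "word"
        else:
            self.form = "bignum"

    def __add__(self, other):
        # A real implementation would add machine words and check for
        # overflow; here Python's own int does the arithmetic and the
        # constructor picks the form for the result.
        return TwoFormInt(self.value + other.value)

    def __repr__(self):
        return "TwoFormInt(%d, %s)" % (self.value, self.form)

small = TwoFormInt(2147450880) + TwoFormInt(1)   # still fits in the word
large = TwoFormInt(WORD_MAX) + TwoFormInt(1)     # one past the word limit
print(small)
print(large)
```

The caller sees one type either way; only the form tag changes when a result crosses the word boundary.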

ChrisA

Cousin Stanley · Mar 25, 2013

Chris said:
The Python 3 merge of int and long has effectively penalized
small-number arithmetic by removing an optimization.
....
The cost is clear.
....

The cost isn't quite as clear
under Debian Wheezy here ....

Stanley C. Kitching
Debian Wheezy

python   inline   range_sum   forloop    forloop_offset
2.7.3    3.1359   3.0725      9.0778     15.6475
3.2.3    2.8226   2.8074      13.47624   13.6430

# ---------------------------------------------------------

Chris Angelico
Debian Wheezy

python   inline   range_sum   forloop    forloop_offset
2.7.3    1.9276   1.9341      5.1463     5.1346
3.2.3    2.8841   2.6586      7.6602     8.1182

Dan Stromberg · Mar 26, 2013

I thought I heard that Python 3.x will use machine words for small
integers, and automatically coerce internally to a 2.x long as needed.

Either way, it's better to pay a small performance cost to avoid problems
when computers move from 32-bit to 64-bit words, or 64-bit to 128-bit words.
With 3.x ints, you don't have to worry about a new crop of CPUs breaking
your code.

Chris Angelico · Mar 26, 2013

The cost isn't quite as clear
under Debian Wheezy here ....

Stanley C. Kitching
Debian Wheezy

python   inline   range_sum   forloop    forloop_offset
2.7.3    3.1359   3.0725      9.0778     15.6475
3.2.3    2.8226   2.8074      13.47624   13.6430

Interesting, so your 3.x sum() is optimizing something somewhere.
Strange. Are we both running the same Python? I got those from
apt-get, aiming for consistency (rather than building a 3.3 from
source).

The cost is still visible in the for-loop versions, though, and you're
still seeing the <2^31 and >2^31 for-loops behave the same way in 3.x
but perform quite differently in 2.x. So it's looking like things are
mostly the same.

ChrisA

Cousin Stanley · Mar 26, 2013

Chris said:
Interesting, so your 3.x sum() is optimizing something somewhere.
Strange. Are we both running the same Python ?

I got those from apt-get
....

I also installed python here under Debian Wheezy
via apt-get and our versions look to be the same ....

-sk-

2.7.3 (default, Jan 2 2013, 16:53:07) [GCC 4.7.2]

3.2.3 (default, Feb 20 2013, 17:02:41) [GCC 4.7.2]

CPU : Intel(R) Celeron(R) D CPU 3.33GHz

-ca-

2.7.3 (default, Jan 2 2013, 13:56:14) [GCC 4.7.2]

3.2.3 (default, Feb 20 2013, 14:44:27) [GCC 4.7.2]

CPU : ???

Could differences in underlying CPU architecture
lead to our differing python integer results ?
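
One thing that could be checked here is the width of the native integer on each build. Python 2's int is a C long, which is 32 bits on Windows and on 32-bit Linux but 64 bits on 64-bit Linux, so the 1000000000000000 starting value in forloop_offset would only force a long on the former. The sketch below checks that arithmetic; fits_in_word and largest_total are made-up names, and the last line only reports the pointer width of whichever build runs it.

```python
import sys

OFFSET = 1000000000000000  # starting total used by forloop_offset

def fits_in_word(value, bits):
    """True if value fits in a signed two's-complement integer of that width."""
    limit = 1 << (bits - 1)
    return -limit <= value < limit

# The largest total the loop reaches is the offset plus sum(range(65536)).
largest_total = OFFSET + 2147450880

print("fits in 32 bits:", fits_in_word(largest_total, 32))
print("fits in 64 bits:", fits_in_word(largest_total, 64))
# sys.maxsize is 2**31-1 on 32-bit builds and 2**63-1 on 64-bit builds.
print("this build: %d-bit" % (sys.maxsize.bit_length() + 1))
```

On Python 2 the value to look at is sys.maxint rather than sys.maxsize. The Debian 2.7.3 run earlier in the thread prints its forloop_offset result without the L suffix, which would be consistent with a 64-bit int on that machine.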

Chris Angelico · Mar 26, 2013

Chris said:

Interesting, so your 3.x sum() is optimizing something somewhere.
Strange. Are we both running the same Python ?

I got those from apt-get
....

I also installed python here under Debian Wheezy
via apt-get and our versions look to be the same ....

-sk-

2.7.3 (default, Jan 2 2013, 16:53:07) [GCC 4.7.2]

3.2.3 (default, Feb 20 2013, 17:02:41) [GCC 4.7.2]

CPU : Intel(R) Celeron(R) D CPU 3.33GHz

-ca-

2.7.3 (default, Jan 2 2013, 13:56:14) [GCC 4.7.2]

3.2.3 (default, Feb 20 2013, 14:44:27) [GCC 4.7.2]

CPU : ???

Could differences in underlying CPU architecture
lead to our differing python integer results ?

Doubtful. I have Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz quad-core
with hyperthreading, but I'm only using one core for this job. I've
run the tests several times and each time, Py2 is a shade under two
seconds for inline/range_sum, and Py3 is about 2.5 seconds for each.
Fascinating.

Just for curiosity's sake, I spun up the tests on my reiplophobic
server, still running Ubuntu Karmic. Pentium(R) Dual-Core CPU
E6500 @ 2.93GHz.

gideon@gideon:~$ python inttime.py
2.6.4 (r264:75706, Dec 7 2009, 18:45:15)
[GCC 4.4.1]
inline: 2147450880
2.7050409317
range_sum: 2147450880
2.64918494225
forloop: 2147450880
6.58765792847
forloop_offset: 2147450880L
16.5167789459
gideon@gideon:~$ python3 inttime.py
3.1.1+ (r311:74480, Nov 2 2009, 14:49:22)
[GCC 4.4.1]
inline: 2147450880
4.44533085823
range_sum: 2147450880
4.37314105034
forloop: 2147450880
12.4834370613
forloop_offset: 2147450880
13.5000522137

Once again, Py3 is slower on small integers than Py2. So where's the
difference with your system? This is really weird! I assume you can
repeat the tests and get the same result every time?

ChrisA

Cousin Stanley · Mar 26, 2013

Chris said:
Once again, Py3 is slower on small integers than Py2.

Chris Angelico
Ubuntu Karmic.
Pentium(R) Dual-Core CPU E6500 @ 2.93GHz.

python   inline   range_sum   forloop    forloop_offset
2.6.4    2.7050   2.6492      6.5877     16.5168
3.1.1    4.4453   4.3731      12.4834    13.5001

You do seem to have a slight py3 improvement
under ubuntu for the forloop_offset case ....

So where's the difference with your system ?

CPU ????

This is really weird !

Yep ...

I assume you can repeat the tests
and get the same result every time ?

Yes ....

First lines of numbers below are from yesterday
while second lines are from today ....

Stanley C. Kitching
Debian Wheezy
Intel(R) Celeron(R) D CPU 3.33GHz Single Core

python   inline   range_sum   forloop    forloop_offset
2.7.3    3.1359   3.0725      9.0778     15.6475
2.7.3    3.0382   3.1452      9.8799     16.8579

3.2.3    2.8226   2.8074      13.47624   13.6430
3.2.3    2.8331   2.8228      13.54151   13.8716

Chris Angelico · Mar 26, 2013

Chris Angelico
Ubuntu Karmic.
Pentium(R) Dual-Core CPU E6500 @ 2.93GHz.

python   inline   range_sum   forloop    forloop_offset
2.6.4    2.7050   2.6492      6.5877     16.5168
3.1.1    4.4453   4.3731      12.4834    13.5001

You do seem to have a slight py3 improvement
under ubuntu for the forloop_offset case ....

Yes, that's correct. The forloop_offset one is using long integers in
all cases. (Well, on Py2 it's adding a series of ints to a long, but
the arithmetic always has to be done with longs.) Python 3 has had
some improvements done, but the main thing is that there's a massive
spike in the Py2 time, while Py3 has _already paid_ that cost - as
evidenced by the closeness of the forloop and forloop_offset times on
Py3.

ChrisA

Terry Reedy · Mar 26, 2013

CPU ????

Compilers and compiler settings can also make a difference.
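
A quick way to compare builds is to print what each interpreter reports about itself. This is a minimal sketch using the standard platform, sys and sysconfig modules; build_info is a made-up helper name, and the CFLAGS entry may well be None on builds that do not record it (Windows, for example).

```python
import platform
import sys
import sysconfig

def build_info():
    """Collect a few facts about how this interpreter was built."""
    return {
        "version": sys.version.split()[0],
        "compiler": platform.python_compiler(),
        "machine": platform.machine(),
        "bits": platform.architecture()[0],
        # Compiler flags recorded at build time, if the build kept them.
        "cflags": sysconfig.get_config_var("CFLAGS"),
    }

for key, value in build_info().items():
    print("%-9s %s" % (key, value))
```

Running this under each Python on each machine and comparing the output side by side should show whether the builds differ in compiler, word size or optimisation flags.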

jmfauth · Mar 26, 2013

The Python 3 merge of int and long has effectively penalized
small-number arithmetic by removing an optimization. As we've seen
from PEP 393 strings (jmf aside), there can be huge benefits from
having a single type with multiple representations internally ...

Chris Angelico · Mar 26, 2013

So?

ChrisA

jmfauth · Mar 26, 2013

So?

ChrisA

A character is not an integer.

jmf

Mark Lawrence · Mar 26, 2013

A character is not an integer.

jmf

But you are an idiot.

Chris Angelico · Mar 26, 2013

A character is not an integer.

Yes, I heard you the first time. And I repeat: A needle pulling thread?

You have not made any actual, uhh, _point_.

ChrisA

Grant Edwards · Mar 26, 2013

But you are an idiot.

I think we all agree that jmf is a character.

So we've established that no characters are integers, but some
characters are idiots.

Does that allow us to determine whether integers are idiots or not?

Chris Angelico · Mar 26, 2013

I think we all agree that jmf is a character.

So we've established that no characters are integers, but some
characters are idiots.

Does that allow us to determine whether integers are idiots or not?

No, it doesn't. I'm fairly confident that most of them are not...
however, I have my eye on 42. He gets around, a bit, but never seems
to do anything very useful. I'd think twice before hiring him.

But 1, now, he's a good fellow. Even when things get divisive, he's
the voice of unity.

ChrisA

Mark Lawrence · Mar 26, 2013

No, it doesn't. I'm fairly confident that most of them are not...
however, I have my eye on 42. He gets around, a bit, but never seems
to do anything very useful. I'd think twice before hiring him.

But 1, now, he's a good fellow. Even when things get divisive, he's
the voice of unity.

ChrisA

Which reminds me, why do people on newsgroups often refer to 101, my
favourite number? I mean, do we really care about the number of a room
that Eric Blair worked in when he was at the BBC?

Dave Angel · Mar 26, 2013

No, it doesn't. I'm fairly confident that most of them are not...
however, I have my eye on 42. He gets around, a bit, but never seems
to do anything very useful. I'd think twice before hiring him.

Ah, 42, the "Answer to Life, the Universe, and Everything"

Gregory Ewing · Mar 26, 2013

No, it doesn't. I'm fairly confident that most of them are not...
however, I have my eye on 42.

He thought he was equal to 6 x 9 at one point, which
seems pretty idiotic to me.

Dave Angel · Mar 26, 2013

He thought he was equal to 6 x 9 at one point, which
seems pretty idiotic to me.

Not in base 13.
