Performance of int/long in Python 3



Chris Angelico

The Python 3 merge of int and long has effectively penalized
small-number arithmetic by removing an optimization. As we've seen
from PEP 393 strings (jmf aside), there can be huge benefits from
having a single type with multiple representations internally. Is
there value in making the int type have a machine-word optimization in
the same way?

The cost is clear. Compare these methods for calculating the sum of
all numbers up to 65535, a total that stays under 2^31:

def range_sum(n):
    return sum(range(n+1))

def forloop(n):
    tot = 0
    for i in range(n+1):
        tot += i
    return tot

def forloop_offset(n):
    tot = 1000000000000000
    for i in range(n+1):
        tot += i
    return tot - 1000000000000000

import timeit
import sys
print(sys.version)
print("inline: %d" % sum(range(65536)))
print(timeit.timeit("sum(range(65536))", number=1000))
for func in ['range_sum', 'forloop', 'forloop_offset']:
    print("%s: %r" % (func, globals()[func](65535)))
    print(timeit.timeit(func+"(65535)", "from __main__ import "+func, number=1000))


Windows XP:
C:\>python26\python inttime.py
2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)]
inline: 2147450880
2.36770455463
range_sum: 2147450880
2.61778550067
forloop: 2147450880
7.91409131608
forloop_offset: 2147450880L
23.3116954809

C:\>python33\python inttime.py
3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600 32 bit (Intel)]
inline: 2147450880
5.25038713020789
range_sum: 2147450880
5.412975112758745
forloop: 2147450880
17.875799577879313
forloop_offset: 2147450880
19.31672544974291

Debian Wheezy:
[email protected]:~$ python inttime.py
2.7.3 (default, Jan 2 2013, 13:56:14)
[GCC 4.7.2]
inline: 2147450880
1.92763710022
range_sum: 2147450880
1.93409109116
forloop: 2147450880
5.14633893967
forloop_offset: 2147450880
5.13459300995
[email protected]:~$ python3 inttime.py
3.2.3 (default, Feb 20 2013, 14:44:27)
[GCC 4.7.2]
inline: 2147450880
2.884124994277954
range_sum: 2147450880
2.6586129665374756
forloop: 2147450880
7.660192012786865
forloop_offset: 2147450880
8.11817193031311


On 2.6/2.7, there's a massive penalty for switching to longs; on
3.2/3.3, the two for-loop versions are nearly identical in time.

(Side point: I'm often seeing that 3.2 on Linux is marginally faster
calling my range_sum function than doing the same thing inline. I do
not understand this. If anyone can explain what's going on there, I'm
all ears!)

Python 3's int is faster than Python 2's long, but slower than Python
2's int. So the question really is, would a two-form representation be
beneficial, and if so, is it worth the coding trouble?
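
A rough way to see the current situation, assuming CPython: the unified int is already one type whose storage grows with the value's magnitude (CPython keeps a variable-length array of 15- or 30-bit "digits"); what it lacks is a machine-word fast path for arithmetic. A sketch:

```python
import sys

# One unified type in Python 3, regardless of magnitude.
small, medium, large = 1, 2**40, 10**100
assert type(small) is type(medium) is type(large)

# But the internal storage grows with the value: CPython represents
# ints as a variable-length array of 15- or 30-bit "digits".
print(sys.getsizeof(small))   # fewest digits
print(sys.getsizeof(medium))
print(sys.getsizeof(large))   # many digits

# CPython also caches small ints (roughly -5..256) as singletons;
# that's an implementation detail, not a machine-word arithmetic path.
a = 2**8
b = 2**8
print(a is b)  # True on CPython
```

The getsizeof numbers differ between 32-bit and 64-bit builds, but the pattern (same type, growing storage) holds on both.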

ChrisA
 

Cousin Stanley

Chris said:
The Python 3 merge of int and long has effectively penalized
small-number arithmetic by removing an optimization.
....
The cost is clear.
....

The cost isn't quite as clear
under Debian Wheezy here ....

Stanley C. Kitching
Debian Wheezy

python   inline   range_sum   forloop    forloop_offset
2.7.3    3.1359   3.0725       9.0778    15.6475
3.2.3    2.8226   2.8074      13.47624   13.6430


# ---------------------------------------------------------

Chris Angelico
Debian Wheezy

python   inline   range_sum   forloop    forloop_offset
2.7.3    1.9276   1.9341       5.1463     5.1346
3.2.3    2.8841   2.6586       7.6602     8.1182
 

Dan Stromberg

I thought I heard that Python 3.x will use machine words for small
integers, and automatically coerce internally to a 2.x long as needed.

Either way, it's better to pay a small performance cost than to have problems
when computers move from 32- to 64-bit words, or from 64- to 128-bit words.
With 3.x ints, you don't have to worry about a new crop of CPUs breaking
your code.
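
That portability point is easy to demonstrate: in Python 3, crossing a machine-word boundary changes only the internal representation, never the result, whereas fixed-width arithmetic would wrap. A small sketch (the wrap64 helper is purely illustrative):

```python
# Python 3 ints have arbitrary precision: crossing the 64-bit boundary
# is invisible to user code.
word_max = 2**63 - 1              # largest signed 64-bit value
print(word_max + 1)               # 9223372036854775808: no overflow

# For contrast, simulate what signed 64-bit two's-complement
# hardware arithmetic would do with the same addition:
def wrap64(x):
    """Reduce x to the signed 64-bit range, wrapping on overflow."""
    x &= (1 << 64) - 1
    return x - (1 << 64) if x >= (1 << 63) else x

print(wrap64(word_max + 1))       # -9223372036854775808: wrapped around
```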
 

Chris Angelico

The cost isn't quite as clear
under Debian Wheezy here ....

Stanley C. Kitching
Debian Wheezy

python   inline   range_sum   forloop    forloop_offset
2.7.3    3.1359   3.0725       9.0778    15.6475
3.2.3    2.8226   2.8074      13.47624   13.6430

Interesting, so your 3.x sum() is optimizing something somewhere.
Strange. Are we both running the same Python? I got those from
apt-get, aiming for consistency (rather than building a 3.3 from
source).

The cost is still visible in the for-loop versions, though, and you're
still seeing the <2^31 and >2^31 for-loops behave the same way in 3.x
but perform quite differently in 2.x. So it's looking like things are
mostly the same.

ChrisA
 

Cousin Stanley

Chris said:
Interesting, so your 3.x sum() is optimizing something somewhere.
Strange. Are we both running the same Python ?

I got those from apt-get
....

I also installed python here under Debian Wheezy
via apt-get and our versions look to be the same ....

-sk-

2.7.3 (default, Jan 2 2013, 16:53:07) [GCC 4.7.2]

3.2.3 (default, Feb 20 2013, 17:02:41) [GCC 4.7.2]

CPU : Intel(R) Celeron(R) D CPU 3.33GHz


-ca-

2.7.3 (default, Jan 2 2013, 13:56:14) [GCC 4.7.2]

3.2.3 (default, Feb 20 2013, 14:44:27) [GCC 4.7.2]

CPU : ???


Could differences in underlying CPU architecture
lead to our differing python integer results ?
 

Chris Angelico

Chris said:
Interesting, so your 3.x sum() is optimizing something somewhere.
Strange. Are we both running the same Python ?

I got those from apt-get
....

I also installed python here under Debian Wheezy
via apt-get and our versions look to be the same ....

-sk-

2.7.3 (default, Jan 2 2013, 16:53:07) [GCC 4.7.2]

3.2.3 (default, Feb 20 2013, 17:02:41) [GCC 4.7.2]

CPU : Intel(R) Celeron(R) D CPU 3.33GHz


-ca-

2.7.3 (default, Jan 2 2013, 13:56:14) [GCC 4.7.2]

3.2.3 (default, Feb 20 2013, 14:44:27) [GCC 4.7.2]

CPU : ???


Could differences in underlying CPU architecture
lead to our differing python integer results ?

Doubtful. I have Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz quad-core
with hyperthreading, but I'm only using one core for this job. I've
run the tests several times and each time, Py2 is a shade under two
seconds for inline/range_sum, and Py3 is about 2.5 seconds for each.
Fascinating.

Just for curiosity's sake, I spun up the tests on my reiplophobic
server, still running Ubuntu Karmic. Pentium(R) Dual-Core CPU
E6500 @ 2.93GHz.

[email protected]:~$ python inttime.py
2.6.4 (r264:75706, Dec 7 2009, 18:45:15)
[GCC 4.4.1]
inline: 2147450880
2.7050409317
range_sum: 2147450880
2.64918494225
forloop: 2147450880
6.58765792847
forloop_offset: 2147450880L
16.5167789459
[email protected]:~$ python3 inttime.py
3.1.1+ (r311:74480, Nov 2 2009, 14:49:22)
[GCC 4.4.1]
inline: 2147450880
4.44533085823
range_sum: 2147450880
4.37314105034
forloop: 2147450880
12.4834370613
forloop_offset: 2147450880
13.5000522137

Once again, Py3 is slower on small integers than Py2. So where's the
difference with your system? This is really weird! I assume you can
repeat the tests and get the same result every time?

ChrisA
 

Cousin Stanley

Chris said:
Once again, Py3 is slower on small integers than Py2.

Chris Angelico
Ubuntu Karmic.
Pentium(R) Dual-Core CPU E6500 @ 2.93GHz.

python   inline   range_sum   forloop    forloop_offset
2.6.4    2.7050   2.6492       6.5877    16.5168
3.1.1    4.4453   4.3731      12.4834    13.5001

You do seem to have a slight py3 improvement
under ubuntu for the forloop_offset case ....

So where's the difference with your system ?

CPU ????

This is really weird !

Yep ...

I assume you can repeat the tests
and get the same result every time ?

Yes ....

First lines of numbers below are from yesterday
while second lines are from today ....

Stanley C. Kitching
Debian Wheezy
Intel(R) Celeron(R) D CPU 3.33GH Single Core

python   inline   range_sum   forloop    forloop_offset
2.7.3    3.1359   3.0725       9.0778    15.6475
2.7.3    3.0382   3.1452       9.8799    16.8579

3.2.3    2.8226   2.8074      13.47624   13.6430
3.2.3    2.8331   2.8228      13.54151   13.8716
 

Chris Angelico

Chris Angelico
Ubuntu Karmic.
Pentium(R) Dual-Core CPU E6500 @ 2.93GHz.

python   inline   range_sum   forloop    forloop_offset
2.6.4    2.7050   2.6492       6.5877    16.5168
3.1.1    4.4453   4.3731      12.4834    13.5001

You do seem to have a slight py3 improvement
under ubuntu for the forloop_offset case ....

Yes, that's correct. The forloop_offset one is using long integers in
all cases. (Well, on Py2 it's adding a series of ints to a long, but
the arithmetic always has to be done with longs.) Python 3 has had
some improvements, but the main thing is that there's a massive
spike in the Py2 time, while Py3 has _already paid_ that cost, as
evidenced by the closeness of the forloop and forloop_offset times on
Py3.
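
That closeness is easy to spot-check on any single Python 3 interpreter. Absolute numbers vary by machine, so only the ratio between the two loops is interesting; a sketch (the `globals=` argument to timeit needs Python 3.5+):

```python
import timeit

def forloop(n):
    tot = 0
    for i in range(n + 1):
        tot += i
    return tot

def forloop_offset(n):
    # The large starting offset forces multi-digit ("long") arithmetic
    # on every iteration.
    tot = 1000000000000000
    for i in range(n + 1):
        tot += i
    return tot - 1000000000000000

# Both loops compute the same sum; only the operand size differs.
assert forloop(65535) == forloop_offset(65535) == 2147450880

t_small = timeit.timeit("forloop(65535)", globals=globals(), number=100)
t_large = timeit.timeit("forloop_offset(65535)", globals=globals(), number=100)
print("small-int loop:", t_small)
print("offset loop:  ", t_large)
```

On Python 3 the two times should come out close, mirroring the thread's measurements; on Python 2 the offset version would be markedly slower.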

ChrisA
 

jmfauth

The Python 3 merge of int and long has effectively penalized
small-number arithmetic by removing an optimization. As we've seen
from PEP 393 strings (jmf aside), there can be huge benefits from
having a single type with multiple representations internally ...
 

Chris Angelico

A character is not an integer.

Yes, I heard you the first time. And I repeat: A needle pulling thread?

You have not made any actual, uhh, _point_.

ChrisA
 

Grant Edwards

But you are an idiot.

I think we all agree that jmf is a character.

So we've established that no characters are integers, but some
characters are idiots.

Does that allow us to determine whether integers are idiots or not?
 

Chris Angelico

I think we all agree that jmf is a character.

So we've established that no characters are integers, but some
characters are idiots.

Does that allow us to determine whether integers are idiots or not?

No, it doesn't. I'm fairly confident that most of them are not...
however, I have my eye on 42. He gets around, a bit, but never seems
to do anything very useful. I'd think twice before hiring him.

But 1, now, he's a good fellow. Even when things get divisive, he's
the voice of unity.

ChrisA
 

Mark Lawrence

No, it doesn't. I'm fairly confident that most of them are not...
however, I have my eye on 42. He gets around, a bit, but never seems
to do anything very useful. I'd think twice before hiring him.

But 1, now, he's a good fellow. Even when things get divisive, he's
the voice of unity.

ChrisA

Which reminds me, why do people on newsgroups often refer to 101, my
favourite number? I mean, do we really care about the number of a room
that Eric Blair worked in when he was at the BBC?
 

Dave Angel

No, it doesn't. I'm fairly confident that most of them are not...
however, I have my eye on 42. He gets around, a bit, but never seems
to do anything very useful. I'd think twice before hiring him.

Ah, 42, the "Answer to Life, the Universe, and Everything"
 

Gregory Ewing

No, it doesn't. I'm fairly confident that most of them are not...
however, I have my eye on 42.

He thought he was equal to 6 x 9 at one point, which
seems pretty idiotic to me.
 