Python 3.3 vs. MSDOS Basic

Discussion in 'Python' started by John Immarino, Feb 18, 2013.

1. John Immarino (Guest)

I coded a Python solution for Problem #14 on the Project Euler website. I was very surprised to find that it took 107 sec. to run even though it's a pretty simple program. I also coded an equivalent solution for the problem in the old MS-DOS BASIC. (That's the 16-bit app of 1980s vintage.) It ran in 56 sec. Is there a flaw in my coding, or is Python really this slow in this particular application? MS-DOS BASIC usually runs at a snail's pace compared to Python.

Below is the problem and the code:

The following iterative sequence is defined for the set of positive integers:

n → n/2 (n is even)
n → 3n + 1 (n is odd)

Using the rule above and starting with 13, we generate the following sequence:
13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1

It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.

Which starting number, under one million, produces the longest chain?

NOTE: Once the chain starts the terms are allowed to go above one million.
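
To make the rule concrete, here is a minimal sketch that builds the chain for a given starting number. The function name collatz_chain is an illustrative assumption and not part of the problem statement. Note that a chain of 10 terms takes 9 steps; the programs in this thread count steps, not terms.

```python
def collatz_chain(n):
    """Return the list of terms from n down to 1 (hypothetical helper)."""
    chain = [n]
    while n != 1:
        # Odd terms go to 3n + 1, even terms are halved.
        n = 3 * n + 1 if n % 2 else n // 2
        chain.append(n)
    return chain

chain = collatz_chain(13)
print(chain)       # [13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
print(len(chain))  # 10 terms, i.e. 9 steps
```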

max=0
m=0
while m<=1000000:
    m+=1
    count=0
    n=m
    while n!=1:
        count+=1
        if n%2==0:
            n=n//2
        else:
            n=3*n+1
    if count>max:
        max=count
        num=m
print(num,max)

John Immarino, Feb 18, 2013

2. Ian Kelly (Guest)

On Mon, Feb 18, 2013 at 12:13 PM, John Immarino wrote:
> I coded a Python solution for Problem #14 on the Project Euler website. I was very surprised to find that it took 107 sec. to run even though it's a pretty simple program. I also coded an equivalent solution for the problem in the old MSDOS basic. (That's the 16 bit app of 1980s vintage.) It ran in 56 sec. Is there a flaw in my coding, or is Python really this slow in this particular application. MSDOS Basic usually runs at a snails pace compared to Python.

Well, I don't see anything that looks especially slow in that code,
but the algorithm that you're using is not very efficient. I rewrote
it using dynamic programming (details left as an exercise), which got
the runtime down to about 4 seconds.

Ian Kelly, Feb 18, 2013

3. Chris Angelico (Guest)

On Tue, Feb 19, 2013 at 6:13 AM, John Immarino wrote:
> I coded a Python solution for Problem #14 on the Project Euler website. I was very surprised to find that it took 107 sec. to run even though it's a pretty simple program. I also coded an equivalent solution for the problem in the old MSDOS basic. (That's the 16 bit app of 1980s vintage.) It ran in 56 sec. Is there a flaw in my coding, or is Python really this slow in this particular application. MSDOS Basic usually runs at a snails pace compared to Python.

BASIC does a lot less. If you wrote an 8086 assembly language
interpreter in Python, it'd run fairly slowly too. Python isn't
really the world's best language for number crunching inside a machine
word; though if this were a major project, I would recommend looking
into Cython, as it lets you translate a few critical portions of your
code to C while leaving the rest in Python.

In order to get some useful stats, I added a little timing code to
your original; on my Windows XP laptop, running Python 3.3, your
version took 212.64 seconds to get to a result (namely, 837799 with a
count of 524).

Here's how I'd code it:

import time
start=time.time()
max=0
for m in range(1,1000001):
    n=m
    count=0
    while n>1:
        if n%2: n=3*n+1
        else: n//=2
        count+=1
    if count>max: max,num=count,m
    if not m&16383: print("->",m,count)
print(num,max)
print(time.time()-start)

(You'll see the same timing information that I added to yours. It adds
a negligible amount to the run-time, and gives some early idea of how
it's going.)

Running under Python 2.6, both your version and mine take about 90
seconds to run. But under Python 3.3, where (among other things)
range() yields values lazily, my version is significantly faster than
yours. BUT! Both versions, under 3.3, are significantly *slower* than
under 2.6. My first thought is that it's because Py2 has different
types for 'int' and 'long', and Py3 doesn't (effectively, everything's
a long), so I added an L suffix to every number and ran each of them
under 2.6 again. Seems that was the bulk of the difference, though not
all.
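
One way to see why the int/long split matters for this problem: some chains leave the signed 32-bit range. The following is a rough sketch for Python 3, where chain_peak is a hypothetical helper name. On Python 2 builds where a plain int is a 32-bit C long (as on Windows), any value past 2**31 - 1 would have been promoted to the slower long type.

```python
def chain_peak(start):
    """Return (steps, largest value seen) for the chain beginning at start."""
    n = start
    steps = 0
    peak = n
    while n > 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
        if n > peak:
            peak = n
    return steps, peak

steps, peak = chain_peak(837799)
# True means the chain needs more than a signed 32-bit machine word.
print(steps, peak, peak > 2**31 - 1)
```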

Pythonistas, does this count as a regression, or is Python
sufficiently "not a number crunching language" that we don't care?

(range = my code, as above; while = original version with a C-style
loop counter; all times in seconds)
range py3:        171.07846403121948
while py3:        212.64104509353638
range py2:        87.859000206
while py2:        86.4059998989
range py2 longs:  190.530999899
while py2 longs:  176.125999928

For comparison purposes, I also coded up the equivalent in Pike.
Pike's a very similar language to Python, but with a C-like syntax,
and certain optimizations - including, significantly to this exercise,
an integer type that sits within a machine word if it can (though
it'll happily go arbitrary precision when it's needed to). It pretends
to the programmer that it's a Py3-style "everything's an int", but
underneath, functions more like Py2 with separate short and long
types. The result: 22.649 seconds to reach the same conclusion.

How long did your BASIC version take, and how long did the Python
version on the same hardware?

This sort of pure number crunching isn't really where a modern high
level language shines. You'll come to *really* appreciate Python as
soon as you start working with huge arrays, dictionaries, etc. This is
a job for C, really.

ChrisA

Chris Angelico, Feb 18, 2013

4. Chris Angelico (Guest)

On Tue, Feb 19, 2013 at 8:55 AM, Chris Angelico wrote:
> How long did your BASIC version take, and how long did the Python
> version on the same hardware?

Which Python version did you use?

ChrisA

Chris Angelico, Feb 18, 2013

5. Chris Angelico (Guest)

On Tue, Feb 19, 2013 at 8:56 AM, Chris Angelico wrote:
> On Tue, Feb 19, 2013 at 8:55 AM, Chris Angelico wrote:
>> How long did your BASIC version take, and how long did the Python
>> version on the same hardware?
>
> Which Python version did you use?
>
> ChrisA

Doh. I'm having a great day of not reading properly, today. (I blame
checking mail on the bus, it took me over an hour to read this one
message and I'd forgotten the subject line by the time I got to the
end.) Python 3.3, right there in the header. Disregard me!

ChrisA

Chris Angelico, Feb 18, 2013

6. Chris Angelico (Guest)

On Tue, Feb 19, 2013 at 8:54 AM, Ian Kelly wrote:
> Well, I don't see anything that looks especially slow in that code,
> but the algorithm that you're using is not very efficient. I rewrote
> it using dynamic programming (details left as an exercise), which got
> the runtime down to about 4 seconds.

Did it involve a dictionary, mapping a value to its count, so that any
time you hit a value you've seen, you can short-cut it? That was my
first optimization consideration, though I didn't implement it in any
version, so as to keep the timings comparable.
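
The shortcut described here could look roughly like the sketch below. It is an illustration under assumptions, not Ian's actual code (which was not posted); the names longest_chain and steps_to_one are made up for the example.

```python
def longest_chain(limit):
    """Return (start, steps) for the longest chain with start < limit."""
    steps_to_one = {1: 0}  # cache: value -> number of steps needed to reach 1
    best_start, best_steps = 1, 0
    for start in range(2, limit):
        n = start
        path = []
        # Walk forward until we reach a value whose count is already known.
        while n not in steps_to_one:
            path.append(n)
            n = 3 * n + 1 if n % 2 else n // 2
        steps = steps_to_one[n]
        # Unwind the path: each earlier value is one step further from 1.
        for value in reversed(path):
            steps += 1
            steps_to_one[value] = steps
        if steps_to_one[start] > best_steps:
            best_start, best_steps = start, steps_to_one[start]
    return best_start, best_steps

print(longest_chain(1000))
```

Calling longest_chain(1000001) would cover the range used in the thread; the dictionary grows to a couple of million entries in that case, so memory use is the price of the speed-up.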

ChrisA

Chris Angelico, Feb 18, 2013

7. Ian Kelly (Guest)

On Mon, Feb 18, 2013 at 3:01 PM, Chris Angelico wrote:
> On Tue, Feb 19, 2013 at 8:54 AM, Ian Kelly wrote:
>> Well, I don't see anything that looks especially slow in that code,
>> but the algorithm that you're using is not very efficient. I rewrote
>> it using dynamic programming (details left as an exercise), which got
>> the runtime down to about 4 seconds.
>
> Did it involve a dictionary, mapping a value to its count, so that any
> time you hit a value you've seen, you can short-cut it? That was my
> first optimization consideration, though I didn't implement it in any
> version, so as to keep the timings comparable.

Ayup.

Ian Kelly, Feb 18, 2013

8. Alexander Blinne (Guest)

On 18.02.2013 20:13, John Immarino wrote:
> I coded a Python solution for Problem #14 on the Project Euler website. I was very surprised to find that it took 107 sec. to run even though it's a pretty simple program. I also coded an equivalent solution for the problem in the old MSDOS basic. (That's the 16 bit app of 1980s vintage.) It ran in 56 sec. Is there a flaw in my coding, or is Python really this slow in this particular application. MSDOS Basic usually runs at a snails pace compared to Python.

> max=0
> m=0
> while m<=1000000:
>     m+=1
>     count=0
>     n=m
>     while n!=1:
>         count+=1
>         if n%2==0:
>             n=n//2
>         else:
>             n=3*n+1
>     if count>max:
>         max=count
>         num=m
> print(num,max)

I cannot compare my timings with BASIC, but Python 2.7.3 and Python 3.2.3
are both equally slow here (~50 sec).
PyPy is a lot faster (only an old version, 1.7.0; current versions
should be faster still) at about 5 sec.

The following C program:

#include <stdio.h>

int main(void) {
    int max = 0;
    int m = 0;
    /* long long: intermediate terms reach about 5.7e10, which overflows a 32-bit long */
    long long n;
    int count;
    int num = 0;

    while(m<=1000000) {
        m++;
        n = m;
        count = 0;

        while(n != 1) {
            count++;
            if(n % 2 == 0) {
                n = n / 2;
            }
            else {
                n = n*3 + 1;
            }
        }

        if(count > max) {
            max = count;
            num = m;
        }
    }

    printf("%d, %d\n", num, max);
    return 0;
}

Does the job in just under 1 sec.

Greetings
Alexander

Alexander Blinne, Feb 19, 2013

9. Dennis Lee Bieber (Guest)

On Mon, 18 Feb 2013 11:13:04 -0800 (PST), John Immarino
declaimed the following in gmane.comp.python.general:

> I coded a Python solution for Problem #14 on the Project Euler website. I was very surprised to find that it took 107 sec. to run even though it's a pretty simple program. I also coded an equivalent solution for the problem in the old MSDOS basic. (That's the 16 bit app of 1980s vintage.) It ran in 56 sec. Is there a flaw in my coding, or is Python really this slow in this particular application. MSDOS Basic usually runs at a snails pace compared to Python.
>

<snip>
> max=0

"max" is a bad name -- it masks the built-in max() function

> m=0
> while m<=1000000:
>     m+=1

Since "m" is only modified here and has a value of 1 for the first
pass through, you can replace those three lines with

for m in xrange(1, 1000001): #python 2.x, just use range() for 3.x

>     count=0
>     n=m
>     while n!=1:
>         count+=1
>         if n%2==0:
>             n=n//2
>         else:
>             n=3*n+1

Avoid the comparison to 0 by reversing the then/else actions... Any
0 result is false.

-=-=-=-=-
import time

mx = 0

start = time.time()
for m in xrange(1, 1000001):
    count = 0
    n = m
    while n > 1:
        count += 1
        if n % 2: # 0 means false
            n = 3 * n + 1
        else:
            n = n // 2

    if count > mx:
        mx, num = count, m

end = time.time()

print num, mx
print end-start
-=-=-=-=-
Microsoft Windows XP [Version 5.1.2600]

E:\UserData\Wulfraed\My Documents>cd "Python Progs"

E:\UserData\Wulfraed\My Documents\Python Progs>Script1.py
837799 524
83.2030000687

E:\UserData\Wulfraed\My Documents\Python Progs>

--
Wulfraed Dennis Lee Bieber AF6VN
HTTP://wlfraed.home.netcom.com/

Dennis Lee Bieber, Feb 19, 2013

10. Terry Reedy (Guest)

On 2/18/2013 2:13 PM, John Immarino wrote:
> I coded a Python solution for Problem #14 on the Project Euler
> website. I was very surprised to find that it took 107 sec. to run
> even though it's a pretty simple program. I also coded an equivalent
> solution for the problem in the old MSDOS basic. (That's the 16 bit
> app of 1980s vintage.) It ran in 56 sec. Is there a flaw in my
> coding, or is Python really this slow in this particular application.
> MSDOS Basic usually runs at a snails pace compared to Python.

I find this surprising too. I am also surprised that it even works,
given that the highest intermediate value is about 57 billion and I do
not remember that Basic had infinite precision ints.

> The following iterative sequence is defined for the set of positive
> integers:
>
> n → n/2 (n is even)
> n → 3n + 1 (n is odd)

Note that if n is odd, 3n + 1 is even (and not 1!), so one may take two
steps with (3n + 1)/2.
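
As a quick sanity check of the double-step idea, the sketch below compares step counts from the plain rule with counts from the combined rule. The function names steps_single and steps_double are illustrative only. The shortcut is safe because for odd n > 1 the value (3n + 1)/2 is at least 5, so the combined step can never jump over 1.

```python
def steps_single(n):
    """Count steps to 1 applying one rule per iteration."""
    count = 0
    while n != 1:
        n = 3 * n + 1 if n & 1 else n // 2
        count += 1
    return count

def steps_double(n):
    """Count steps to 1, folding each odd step with the halving that follows."""
    count = 0
    while n != 1:
        if n & 1:
            n = (3 * n + 1) // 2  # two steps in one
            count += 2
        else:
            n //= 2
            count += 1
    return count

# Both counting schemes should agree for every starting value tried.
print(all(steps_single(n) == steps_double(n) for n in range(1, 10000)))
```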

> Using the rule above and starting with 13, we generate the following
> sequence: 13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1
>
> It can be seen that this sequence (starting at 13 and finishing at 1)
> contains 10 terms. Although it has not been proved yet (Collatz
> Problem), it is thought that all starting numbers finish at 1.

https://en.wikipedia.org/wiki/Collatz_conjecture

> Which starting number, under one million, produces the longest
> chain?

I suppose 'print(837799)' would not count as a proper solution.

> NOTE: Once the chain starts the terms are allowed to go above one
> million.

Here is my slightly revised code with timings on a good, 20 month old
win 7 machine.

from time import time
start = time()

num, max = 0, 0
for m in range(1, 1000001):
    n = m
    count=0
    while n !=1:
        if n & 1: #n % 2:
            n = (3*n + 1) // 2
            count += 2
        else:
            n = n//2
            count += 1
    if count > max:
        num = m
        max = count

print(num, max , time()-start)

# original: 837799, 524 steps, 53.9 secs
# for ... range: 52.3
# reverse inner if 49.0
# double step 39.1
# n & 1 instead of n % 2 for test: 36.0, 36.0, 35.9
# n>>1 instead of n//2: 34.7, 36.1, 36.2;
# this may be internally optimized, so skip

I do not see any fluff left to remove, unless one takes the major step
of saving already calculated values in an array.

Since the highest intermediate value of n is 56991483520 (445245965
*2**7, from adding "if n > maxn: maxn = n" to the odd branch, before
dividing by 2), the array would have to be limited to a much lower
value, say a few million.
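
A bounded array of the kind described might be sketched as follows. This is only one possible reading, under the assumption that it is enough to cache the counts of starting values; the names longest_chain_array and cache_size are hypothetical, and no timing is claimed for it.

```python
def longest_chain_array(limit, cache_size):
    """Longest chain for starts below limit, caching counts for n < cache_size."""
    cache = [0] * cache_size  # cache[n] = steps from n to 1; 0 means unknown
    best_start, best_steps = 1, 0
    for start in range(2, limit):
        n = start
        steps = 0
        # Walk until we reach 1 or a value whose count is already stored.
        while n != 1 and not (n < cache_size and cache[n]):
            n = 3 * n + 1 if n & 1 else n // 2
            steps += 1
        if n != 1:
            steps += cache[n]
        # Values above cache_size are simply never stored.
        if start < cache_size:
            cache[start] = steps
        if steps > best_steps:
            best_start, best_steps = start, steps
    return best_start, best_steps

print(longest_chain_array(1000, 1000))
```

Because every chain eventually drops below its starting value (for the ranges tested so far), most walks end at a cached entry after a handful of steps, even though large intermediate terms are not stored.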

--
Terry Jan Reedy

Terry Reedy, Feb 19, 2013

11. John Immarino (Guest)

On Monday, February 18, 2013 2:58:57 PM UTC-7, Chris Angelico wrote:
> On Tue, Feb 19, 2013 at 8:56 AM, Chris Angelico wrote:
> > On Tue, Feb 19, 2013 at 8:55 AM, Chris Angelico wrote:
> >> How long did your BASIC version take, and how long did the Python
> >> version on the same hardware?
> >
> > Which Python version did you use?
> >
> > ChrisA
>
> Doh. I'm having a great day of not reading properly, today. (I blame
> checking mail on the bus, it took me over an hour to read this one
> message and I'd forgotten the subject line by the time I got to the
> end.) Python 3.3, right there in the header. Disregard me!
>
> ChrisA

Thanks, Chris. I'm a newbie to Python and didn't realize that it's not as good at number crunching as some of the others. It does seem to do better than Basic with numbers in lists as opposed to arrays in Basic.

John Immarino, Feb 19, 2013

13. John Immarino (Guest)

> > max=0
>
> "max" is a bad name -- it masks the built-in max() function
>
> > m=0
> > while m<=1000000:
> >     m+=1
>
> Since "m" is only modified here and has a value of 1 for the first
> pass through, you can replace those three lines with
>
> for m in xrange(1, 1000001): #python 2.x, just use range() for 3.x
>
> >     count=0
> >     n=m
> >     while n!=1:
> >         count+=1
> >         if n%2==0:
> >             n=n//2
> >         else:
> >             n=3*n+1
>
> Avoid the comparison to 0 by reversing the then/else actions... Any
> 0 result is false.
>
> -=-=-=-=-
> import time
>
> mx = 0
>
> start = time.time()
> for m in xrange(1, 1000001):
>     count = 0
>     n = m
>     while n > 1:
>         count += 1
>         if n % 2: # 0 means false
>             n = 3 * n + 1
>         else:
>             n = n // 2
>
>     if count > mx:
>         mx, num = count, m
>
> end = time.time()
>
> print num, mx
> print end-start
> -=-=-=-=-
> Microsoft Windows XP [Version 5.1.2600]
> (C) Copyright 1985-2001 Microsoft Corp.
>
> E:\UserData\Wulfraed\My Documents>cd "Python Progs"
>
> E:\UserData\Wulfraed\My Documents\Python Progs>Script1.py
> 837799 524
> 83.2030000687
>
> E:\UserData\Wulfraed\My Documents\Python Progs>
>
> --
> Wulfraed Dennis Lee Bieber AF6VN
> HTTP://wlfraed.home.netcom.com/

Thanks, your suggestions are well taken.

John Immarino, Feb 19, 2013

15. Chris Angelico (Guest)

On Tue, Feb 19, 2013 at 12:39 PM, John Immarino wrote:
> On Monday, February 18, 2013 2:58:57 PM UTC-7, Chris Angelico wrote:
>> On Tue, Feb 19, 2013 at 8:56 AM, Chris Angelico wrote:
>> > On Tue, Feb 19, 2013 at 8:55 AM, Chris Angelico wrote:
>> >> How long did your BASIC version take, and how long did the Python
>> >> version on the same hardware?
>> >
>> > Which Python version did you use?
>> >
>> > ChrisA
>>
>> Doh. I'm having a great day of not reading properly, today. (I blame
>> checking mail on the bus, it took me over an hour to read this one
>> message and I'd forgotten the subject line by the time I got to the
>> end.) Python 3.3, right there in the header. Disregard me!
>>
>> ChrisA
>
> Thanks, Chris. I'm a newbie to Python and didn't realize that it's not as good at number crunching as some of the others. It does seem to do better than Basic with numbers in lists as opposed to arrays in Basic.

Yes, Python is excellent at data handling. I'll cheerfully use Python
to manipulate huge lists or arrays, and its performance at that is
usually well within the "good enough" range (for instance, anything
that manipulates the file system will be waiting on my disks, not on
Python). It's an excellent tool in the toolkit, just not the one
solution to everything. (Nothing's that!)

ChrisA

Chris Angelico, Feb 19, 2013

16. Nick Mellor (Guest)

Hi John,

Thanks for the problem. I've been writing Python for about 4 years now and am beginning to feel like I'm writing much better Python code.

Python does fine on this problem if you play to its strengths. The following uses dictionary lookups to store previously computed sequence lengths, thus saving a lot of work. The problem is very "sparse", i.e. there are huge gaps between numbers that are actually used in the solution, making dictionaries a better fit than lists.

This code crosses the line in under 3s on a 64-bit laptop. MS-DOS BASIC anyone?

I tried precomputing powers of 2 and multiples of 2, but to my surprise it made very little difference to timings. Even though precomputing n//2 is fast, I think again this is because the problem is sparse: the time saved at lookup barely offsets the cost of precomputing many multiples of 2 that are never needed.

Best wishes,

Nick

And the winner is 837799 with sequence length 524
Time (s): 2.924168109893799
Sequence is:
[837799, 2513398, 1256699, 3770098, 1885049, 5655148, 2827574, 1413787, 4241362, 2120681, 6362044, 3181022, 1590511, 4771534, 2385767, 7157302, 3578651, 10735954, 5367977, 16103932, 8051966, 4025983, 12077950, 6038975, 18116926, 9058463, 27175390, 13587695, 40763086, 20381543, 61144630, 30572315, 91716946, 45858473, 137575420, 68787710, 34393855, 103181566, 51590783, 154772350, 77386175, 232158526, 116079263, 348237790, 174118895, 522356686, 261178343, 783535030, 391767515, 1175302546, 587651273, 1762953820, 881476910, 440738455, 1322215366, 661107683, 1983323050, 991661525, 2974984576, 1487492288, 743746144, 371873072, 185936536, 92968268, 46484134, 23242067, 69726202, 34863101, 104589304, 52294652, 26147326, 13073663, 39220990, 19610495, 58831486, 29415743, 88247230, 44123615, 132370846, 66185423, 198556270, 99278135, 297834406, 148917203, 446751610, 223375805, 670127416, 335063708, 167531854, 83765927, 251297782, 125648891, 376946674, 188473337, 565420012, 282710006, 141355003, 424065010, 212032505, 636097516, 318048758, 159024379,477073138, 238536569, 715609708, 357804854, 178902427, 536707282, 268353641, 805060924, 402530462, 201265231, 603795694, 301897847, 905693542, 452846771, 1358540314, 679270157, 2037810472, 1018905236, 509452618, 254726309, 764178928, 382089464, 191044732, 95522366, 47761183, 143283550, 71641775, 214925326, 107462663, 322387990, 161193995, 483581986, 241790993, 725372980, 362686490, 181343245, 544029736, 272014868, 136007434, 68003717, 204011152,102005576, 51002788, 25501394, 12750697, 38252092, 19126046, 9563023, 28689070, 14344535, 43033606, 21516803, 64550410, 32275205, 96825616, 48412808,24206404, 12103202, 6051601, 18154804, 9077402, 4538701, 13616104, 6808052, 3404026, 1702013, 5106040, 2553020, 1276510, 638255, 1914766, 957383, 2872150, 1436075, 4308226, 2154113, 6462340, 3231170, 1615585, 4846756, 2423378, 1211689, 3635068, 1817534, 908767, 2726302, 1363151, 4089454, 2044727, 6134182, 3067091, 9201274, 4600637, 13801912, 
6900956, 3450478, 1725239, 5175718, 2587859, 7763578, 3881789, 11645368, 5822684, 2911342, 1455671, 4367014, 2183507, 6550522, 3275261, 9825784, 4912892, 2456446, 1228223, 3684670,1842335, 5527006, 2763503, 8290510, 4145255, 12435766, 6217883, 18653650, 9326825, 27980476, 13990238, 6995119, 20985358, 10492679, 31478038, 15739019, 47217058, 23608529, 70825588, 35412794, 17706397, 53119192, 26559596, 13279798, 6639899, 19919698, 9959849, 29879548, 14939774, 7469887, 22409662, 11204831, 33614494, 16807247, 50421742, 25210871, 75632614, 37816307, 113448922, 56724461, 170173384, 85086692, 42543346, 21271673, 63815020, 31907510, 15953755, 47861266, 23930633, 71791900, 35895950, 17947975, 53843926, 26921963, 80765890, 40382945, 121148836, 60574418, 30287209, 90861628, 45430814, 22715407, 68146222, 34073111, 102219334, 51109667, 153329002, 76664501, 229993504, 114996752, 57498376, 28749188, 14374594, 7187297, 21561892, 10780946, 5390473, 16171420, 8085710, 4042855, 12128566, 6064283, 18192850, 9096425, 27289276, 13644638, 6822319, 20466958, 10233479, 30700438, 15350219, 46050658, 23025329, 69075988, 34537994, 17268997, 51806992, 25903496, 12951748, 6475874, 3237937, 9713812, 4856906, 2428453, 7285360, 3642680, 1821340, 910670, 455335, 1366006, 683003, 2049010, 1024505, 3073516, 1536758, 768379, 2305138, 1152569, 3457708, 1728854, 864427, 2593282, 1296641, 3889924, 1944962, 972481, 2917444, 1458722, 729361, 2188084, 1094042, 547021, 1641064, 820532, 410266, 205133, 615400, 307700, 153850, 76925, 230776, 115388, 57694, 28847, 86542, 43271, 129814, 64907, 194722, 97361, 292084, 146042, 73021, 219064, 109532, 54766, 27383, 82150, 41075, 123226, 61613, 184840, 92420, 46210, 23105, 69316, 34658, 17329, 51988, 25994, 12997, 38992, 19496, 9748, 4874, 2437, 7312, 3656, 1828, 914, 457, 1372, 686, 343, 1030, 515, 1546, 773, 2320, 1160, 580, 290, 145, 436, 218, 109, 328, 164, 82, 41, 124, 62, 31, 94, 47, 142, 71, 214, 107, 322, 161, 484, 242, 121, 364, 182, 91, 274, 137, 412, 206, 
103, 310, 155, 466, 233, 700, 350, 175, 526, 263, 790, 395, 1186, 593, 1780, 890, 445, 1336, 668, 334, 167, 502, 251, 754, 377, 1132,566, 283, 850, 425, 1276, 638, 319, 958, 479, 1438, 719, 2158, 1079, 3238,1619, 4858, 2429, 7288, 3644, 1822, 911, 2734, 1367, 4102, 2051, 6154, 3077, 9232, 4616, 2308, 1154, 577, 1732, 866, 433, 1300, 650, 325, 976, 488, 244, 122, 61, 184, 92, 46, 23, 70, 35, 106, 53, 160, 80, 40, 20, 10, 5, 16, 8, 4, 2, 1]
Sparsity calculations...
Computed sequence lengths 2168611
Largest term: 56991483520
Test range: 1 1000000
Biggest gap: 4508198208
Sparsity: 0.00175%

# If True, will precompute powers of 2 and multiples of 2
# in practice this made little difference on 64-bit hardware
OPTIMISE = True

def build_sequence(n):
    """return sequence as a list given the starting number
    Uses the trail of data left by compute_sequence"""
    compute_sequence(n)  # populate next_num for this chain; result not needed
    sequence = []
    while n:
        sequence.append(n)
        n = next_num[n]
    return sequence

def compute_sequence(n):
    """lazily compute sequences for Collatz problem"""
    if n in seqlength:
        return seqlength[n]
    if n not in next_num:
        # NOTE: (some) evens are pre-computed
        next_num[n] = 3 * n + 1 if n % 2 else n // 2
    seqlength[n] = 1 + compute_sequence(next_num[n])
    return seqlength[n]

import time
start = time.time()

highest_number = int(1000000)
highest_term = highest_number * 3 + 1
highest_term += 1 if highest_term % 2 else 0

next_num = {2:1}
if OPTIMISE:
    # quickly pre-compute (some of) the evens (used for n = n//2 if n is even)
    # how many should we precompute? Any mathematicians?
    doubles = range(2, highest_term, 2)
    numbers = range(1, highest_term//2)
    next_num = dict(zip(doubles, numbers))
# mark 1 as the end-point of any sequence
next_num[1] = 0

# initialise the sequence lengths
seqlength = {}
seqlength[1] = 0
seqlength[2] = 1
if OPTIMISE:
    # powers of 2 are trivial: 2**n has sequence length n
    n = 2
    pwr = 4
    while pwr < highest_term:
        seqlength[pwr] = n
        pwr = pwr * 2
        n += 1
max_length = 0
for n in range(3, highest_number + 1):
    length = compute_sequence(n)
    if length > max_length:
        max_length = length
        winning_number = n
print ("And the winner is {0} with sequence length {1}".format(winning_number, max_length))
end = time.time()
print ("Time (s): ", (end-start))

print ("Sequence is:")
print (build_sequence(winning_number))

# Sparsity calculation
sorted_seqlengths = sorted(seqlength.keys())
print ("Sparsity calculations...")
print ("Computed sequence lengths", len(seqlength))
largest_term = sorted_seqlengths[-1]
print ("Largest term: ", largest_term)
print ("Test range: ", 1, highest_number)
gaps = (second - first for first, second in zip(sorted_seqlengths[0:-1], sorted_seqlengths[1:]))
biggest_gap = 0
for n in gaps:
    if biggest_gap < n:
        biggest_gap = n
# NOTE: this prints n, the last gap examined; print biggest_gap to report the maximum
print ("Biggest gap: ", n)
print ("Sparsity: {0:.5f}%".format(highest_number / largest_term * 100))

On Tuesday, 19 February 2013 14:01:31 UTC+11, Chris Angelico wrote:
> On Tue, Feb 19, 2013 at 12:39 PM, John Immarino wrote:
> > On Monday, February 18, 2013 2:58:57 PM UTC-7, Chris Angelico wrote:
> >> On Tue, Feb 19, 2013 at 8:56 AM, Chris Angelico wrote:
> >> > On Tue, Feb 19, 2013 at 8:55 AM, Chris Angelico wrote:
> >> >> How long did your BASIC version take, and how long did the Python
> >> >> version on the same hardware?
> >> >
> >> > Which Python version did you use?
> >> >
> >> > ChrisA
> >>
> >> Doh. I'm having a great day of not reading properly, today. (I blame
> >> checking mail on the bus, it took me over an hour to read this one
> >> message and I'd forgotten the subject line by the time I got to the
> >> end.) Python 3.3, right there in the header. Disregard me!
> >>
> >> ChrisA
> >
> > Thanks, Chris. I'm a newbie to Python and didn't realize that it's not as good at number crunching as some of the others. It does seem to do better than Basic with numbers in lists as opposed to arrays in Basic.
>
> Yes, Python is excellent at data handling. I'll cheerfully use Python
> to manipulate huge lists or arrays, and its performance at that is
> usually well within the "good enough" range (for instance, anything
> that manipulates the file system will be waiting on my disks, not on
> Python). It's an excellent tool in the toolkit, just not the one
> solution to everything. (Nothing's that!)
>
> ChrisA

Nick Mellor, Feb 19, 2013
Sparsity calculations...
Computed sequence lengths 2168611
Largest term: 56991483520
Test range: 1 1000000
Biggest gap: 4508198208
Sparsity: 0.00175%

# If True, will precompute powers of 2 and multiples of 2
# in practice this made little difference on 64-bit hardware
OPTIMISE = True

def build_sequence(n):
    """return sequence as a list given the starting number
    Uses the trail of data left by compute_sequence"""
    # called only for its side effect: it fills next_num along the chain
    compute_sequence(n)
    sequence = []
    while n:
        sequence.append(n)
        n = next_num[n]
    return sequence

def compute_sequence(n):
    """lazily compute sequences for Collatz problem"""
    if n in seqlength:
        return seqlength[n]
    if n not in next_num:
        # NOTE: (some) evens are pre-computed
        next_num[n] = 3 * n + 1 if n % 2 else n // 2
    seqlength[n] = 1 + compute_sequence(next_num[n])
    return seqlength[n]

import time
start = time.time()

highest_number = int(1000000)
highest_term = highest_number * 3 + 1
highest_term += 1 if highest_term % 2 else 0

next_num = {2:1}
if OPTIMISE:
    # quickly pre-compute (some of) the evens (used for n = n//2 if n is even)
    # how many should we precompute? Any mathematicians?
    doubles = range(2, highest_term, 2)
    numbers = range(1, highest_term//2)
    next_num = dict(zip(doubles, numbers))
# mark 1 as the end-point of any sequence
next_num[1] = 0

# initialise the sequence lengths
seqlength = {}
seqlength[1] = 0
seqlength[2] = 1
if OPTIMISE:
    # powers of 2 are trivial: 2**n has sequence length n
    n = 2
    pwr = 4
    while pwr < highest_term:
        seqlength[pwr] = n
        pwr = pwr * 2
        n += 1
max_length = 0
for n in range(3, highest_number + 1):
    length = compute_sequence(n)
    if length > max_length:
        max_length = length
        winning_number = n
print ("And the winner is {0} with sequence length {1}".format(winning_number, max_length))
end = time.time()
print ("Time (s): ", (end-start))

print ("Sequence is:")
print (build_sequence(winning_number))

# Sparsity calculation
sorted_seqlengths = sorted(seqlength.keys())
print ("Sparsity calculations...")
print ("Computed sequence lengths", len(seqlength))
largest_term = sorted_seqlengths[-1]
print ("Largest term: ", largest_term)
print ("Test range: ", 1, highest_number)
gaps = (second - first for first, second in zip(sorted_seqlengths[0:-1], sorted_seqlengths[1:]))
biggest_gap = 0
for n in gaps:
    if biggest_gap < n:
        biggest_gap = n
# report the maximum found, not the last gap the loop happened to see
print ("Biggest gap: ", biggest_gap)
print ("Sparsity: {0:.5f}%".format(highest_number / largest_term * 100))

On Tuesday, 19 February 2013 14:01:31 UTC+11, Chris Angelico wrote:
> On Tue, Feb 19, 2013 at 12:39 PM, John Immarino <> wrote:
> > On Monday, February 18, 2013 2:58:57 PM UTC-7, Chris Angelico wrote:
> >> On Tue, Feb 19, 2013 at 8:56 AM, Chris Angelico <> wrote:
> >> > On Tue, Feb 19, 2013 at 8:55 AM, Chris Angelico <> wrote:
> >> >> How long did your BASIC version take, and how long did the Python
> >> >> version on the same hardware?
> >> >
> >> > Which Python version did you use?
> >> >
> >> > ChrisA
> >>
> >> Doh. I'm having a great day of not reading properly, today. (I blame
> >> checking mail on the bus, it took me over an hour to read this one
> >> message and I'd forgotten the subject line by the time I got to the
> >> end.) Python 3.3, right there in the header. Disregard me!
> >>
> >> ChrisA
> >
> > Thanks, Chris. I'm a newbie to Python and didn't realize that it's not as good at number crunching as some of the others. It does seem to do better than Basic with numbers in lists as opposed to arrays in Basic.
>
> Yes, Python is excellent at data handling. I'll cheerfully use Python
> to manipulate huge lists or arrays, and its performance at that is
> usually well within the "good enough" range (for instance, anything
> that manipulates the file system will be waiting on my disks, not on
> Python). It's an excellent tool in the toolkit, just not the one
> solution to everything. (Nothing's that!)
>
> ChrisA

Nick Mellor, Feb 19, 2013
18. Terry ReedyGuest

On 2/18/2013 4:55 PM, Chris Angelico wrote:

> Running under Python 2.6, both your version and mine take about 90
> seconds to run. But under Python 3.3, where (among other things)
> range() yields values lazily, my version is significantly faster than
> yours. BUT! Both versions, under 3.3, are significantly *slower* than
> under 2.6. My first thought is that it's because Py2 has different
> types for 'int' and 'long', and Py3 doesn't (effectively, everything's
> a long), so I added an L suffix to every number and ran each of them
> under 2.6 again. Seems that was the bulk of the difference, though not
> all.
>
> Pythonistas, does this count as a regression, or is Python
> sufficiently "not a number crunching language" that we don't care?

Both. This brute-force algorithm is almost pure number crunching. This
is the sort of thing pypy and cython are good at speeding up. (I leave
out numpy only because it is not an array-oriented problem.)

I put a counter in the inner loop of my improved version that does
(3*n+1)//2 in one step and got 87 826 478 in 40 seconds (without the
counter). That is about 2 million loops per second, and each loop does a
compare, one or two integer ops, and creates and releases one or two ints.
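
As a rough illustration of the kind of loop described above, here is a minimal sketch (not necessarily the code that was timed; the function name and the returned tuple are made up for this example). It folds the odd step and the halving that always follows it into one iteration, and counts inner-loop iterations separately from Collatz steps.

```python
def longest_chain(limit):
    """Brute-force search over starts 2..limit-1.

    Returns (best_start, best_steps, loops), where best_steps is the number
    of Collatz steps needed to reach 1 and loops is the total number of
    inner-loop iterations over all starts.
    """
    best_start, best_steps, loops = 1, 0, 0
    for start in range(2, limit):
        n, steps = start, 0
        while n != 1:
            loops += 1
            if n % 2:
                # 3*n + 1 is always even when n is odd, so halve it at once;
                # this one iteration accounts for two Collatz steps.
                n = (3 * n + 1) // 2
                steps += 2
            else:
                n //= 2
                steps += 1
        if steps > best_steps:
            best_start, best_steps = start, steps
    return best_start, best_steps, loops
```

Calling longest_chain(1000000) performs the full search; the loops value is what a counter placed in the inner loop would report.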

If I were doing a lot of int crunching like this with CPython and were
building my own interpreter, I would greatly expand the range of
pre-allocated 'small' ints to avoid some of the repeated allocation and
de-allocation. On a multi-gibibyte machine, allocating up to 1000000
instead of 256 would be feasible.

As Ian noted, an intelligent algorithm in CPython can match pypy and is
in the ballpark of C, but is much easier to write in Python than C. It
is possible that Ian's code could be improved further. A pre-allocated
array + dict might be faster. Whenever an odd value is filled in,
powers of 2 times that value can also be.
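
One possible reading of that suggestion, as a hedged sketch only (it is not Ian's or anyone else's posted code, and the names collatz_steps_table, fill, and overflow are invented here): a pre-allocated list holds step counts for values below the limit, a dict catches terms that wander above it, and each time an odd value is resolved its power-of-2 multiples are filled in as well. A plain list stands in for the array.

```python
def collatz_steps_table(limit):
    """Return a list where entry n is the Collatz step count of n (1 <= n < limit).

    Entry 0 is unused and stays at -1, the "not yet known" marker.
    """
    steps = [-1] * limit   # pre-allocated table for values below limit
    overflow = {}          # step counts for terms at or above limit

    def fill(odd, count):
        # Record an odd value, then each power-of-2 multiple below limit,
        # which needs one more step per doubling.
        n = odd
        while n < limit:
            steps[n] = count
            n *= 2
            count += 1

    fill(1, 0)
    for odd in range(3, limit, 2):
        # Walk forward until a term with a known step count turns up.
        trail = []
        n = odd
        while True:
            known = steps[n] if n < limit else overflow.get(n, -1)
            if known >= 0:
                break
            trail.append(n)
            n = 3 * n + 1 if n % 2 else n // 2
        # Unwind the trail: each earlier term is one step further from 1.
        for term in reversed(trail):
            known += 1
            if term >= limit:
                overflow[term] = known
            elif term % 2:
                fill(term, known)
            else:
                steps[term] = known
    return steps
```

With the table built, max(range(1, limit), key=table.__getitem__) picks out the start with the longest chain. Whether this actually beats a plain dict would need to be measured.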

--
Terry Jan Reedy

Terry Reedy, Feb 19, 2013
19. Anssi SaariGuest

John Immarino <> writes:

> I coded a Python solution for Problem #14 on the Project Euler
> website. I was very surprised to find that it took 107 sec. to run
> even though it's a pretty simple program. I also coded an equivalent
> solution for the problem in the old MSDOS basic. (That's the 16 bit
> app of 1980s vintage.)

Just out of curiosity, can you post the basic version as well?

Anssi Saari, Feb 19, 2013
20. Serhiy StorchakaGuest

On 18.02.13 21:13, John Immarino wrote:
> max=0
> m=0
> while m<=1000000:
>     m+=1
>     count=0
>     n=m
>     while n!=1:
>         count+=1
>         if n%2==0:
>             n=n//2
>         else:
>             n=3*n+1
>     if count>max:
>         max=count
>         num=m
> print(num,max)

Some minor tips:

1. Use range() for m iteration.
2. Instead of "if n%2==0:" use just "if n%2:" (and swap the two
branches, since the test is now true when n is odd).
3. Convert all your code to a function. Python is a little faster with
locals than with globals.
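
Taken together, the three tips might look like the sketch below. The function name and the best_start/best_count variables are chosen for this example; the original used max and num at module level, and renaming max also avoids shadowing the built-in. The search covers starts from 1 up to limit - 1.

```python
def longest_collatz(limit):
    """Return (start, steps) for the start below limit with the longest chain."""
    best_start, best_count = 1, 0
    for m in range(1, limit):      # tip 1: range() instead of a manual counter
        count = 0
        n = m
        while n != 1:
            count += 1
            if n % 2:              # tip 2: true for odd n, so the branches swap
                n = 3 * n + 1
            else:
                n = n // 2
        if count > best_count:
            best_count = count
            best_start = m
    return best_start, best_count  # tip 3: everything above uses local names
```

print(longest_collatz(1000000)) then runs the same brute-force search as the original program.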