Hi John,

Thanks for the problem. I've been writing Python for about 4 years now and am beginning to feel like I'm writing much better Python code.

Python does fine on this problem if you play to its strengths. The code below stores previously computed sequence lengths in a dictionary, so each length is worked out only once, which saves a lot of work. The problem is very "sparse", i.e. there are huge gaps between the numbers that actually turn up in the solution, which makes dictionaries a better fit than lists.
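To show the caching idea on its own, here is a minimal sketch at a smaller scale. It is an iterative variant rather than the recursive function in the full listing, and the names `collatz_length` and `cache` are made up for this illustration.

```python
def collatz_length(n, cache):
    """Return the number of steps n takes to reach 1, caching every result."""
    path = []
    # Walk forward until we meet a number whose length is already known.
    while n not in cache:
        path.append(n)
        n = 3 * n + 1 if n % 2 else n // 2
    length = cache[n]
    # Walk back along the path, recording the length of each number visited.
    for m in reversed(path):
        length += 1
        cache[m] = length
    return length


cache = {1: 0}  # 1 is the end-point, so it needs zero steps
best = max(range(1, 1000), key=lambda n: collatz_length(n, cache))
print(best, cache[best])
```

Because the cache is an ordinary dictionary, it only ever holds the numbers that some sequence actually visited, however large they are.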

This code crosses the line in under 3s on a 64-bit laptop. MS-DOS BASIC anyone?

I tried precomputing powers of 2 and multiples of 2, but to my surprise it made very little difference to the timings. Even though precomputing n//2 is fast, I think this is again because the problem is sparse: whatever time is saved is cancelled out by the cost of precomputing many multiples of 2 that are never needed.
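One way to test that hunch is to count how many of the precomputed halving entries are ever visited. The sketch below does this for a small range; `halving_table_usage` is a hypothetical helper written for this note, and it uses the same upper bound as `highest_term` in the full listing.

```python
def halving_table_usage(limit):
    """Return (evens precomputed, evens actually visited) for starts 1..limit."""
    top = 3 * limit + 1
    top += top % 2  # round up to an even bound, as the listing does
    precomputed = len(range(2, top, 2))

    seen = {1}
    for start in range(1, limit + 1):
        n = start
        # Follow the sequence until it joins one we have already walked.
        while n not in seen:
            seen.add(n)
            n = 3 * n + 1 if n % 2 else n // 2

    used = sum(1 for n in seen if n % 2 == 0 and n < top)
    return precomputed, used


precomputed, used = halving_table_usage(10)
print(precomputed, used)
```

Running it with a larger `limit` would show what share of the table is wasted, which is the quantity the explanation above depends on.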

Best wishes,

Nick

And the winner is 837799 with sequence length 524

Time (s): 2.924168109893799

Sequence is:

[837799, 2513398, 1256699, 3770098, 1885049, 5655148, 2827574, 1413787, 4241362, 2120681, 6362044, 3181022, 1590511, 4771534, 2385767, 7157302, 3578651, 10735954, 5367977, 16103932, 8051966, 4025983, 12077950, 6038975, 18116926, 9058463, 27175390, 13587695, 40763086, 20381543, 61144630, 30572315, 91716946, 45858473, 137575420, 68787710, 34393855, 103181566, 51590783, 154772350, 77386175, 232158526, 116079263, 348237790, 174118895, 522356686, 261178343, 783535030, 391767515, 1175302546, 587651273, 1762953820, 881476910, 440738455, 1322215366, 661107683, 1983323050, 991661525, 2974984576, 1487492288, 743746144, 371873072, 185936536, 92968268, 46484134, 23242067, 69726202, 34863101, 104589304, 52294652, 26147326, 13073663, 39220990, 19610495, 58831486, 29415743, 88247230, 44123615, 132370846, 66185423, 198556270, 99278135, 297834406, 148917203, 446751610, 223375805, 670127416, 335063708, 167531854, 83765927, 251297782, 125648891, 376946674, 188473337, 565420012, 282710006, 141355003, 424065010, 212032505, 636097516, 318048758, 159024379,477073138, 238536569, 715609708, 357804854, 178902427, 536707282, 268353641, 805060924, 402530462, 201265231, 603795694, 301897847, 905693542, 452846771, 1358540314, 679270157, 2037810472, 1018905236, 509452618, 254726309, 764178928, 382089464, 191044732, 95522366, 47761183, 143283550, 71641775, 214925326, 107462663, 322387990, 161193995, 483581986, 241790993, 725372980, 362686490, 181343245, 544029736, 272014868, 136007434, 68003717, 204011152,102005576, 51002788, 25501394, 12750697, 38252092, 19126046, 9563023, 28689070, 14344535, 43033606, 21516803, 64550410, 32275205, 96825616, 48412808,24206404, 12103202, 6051601, 18154804, 9077402, 4538701, 13616104, 6808052, 3404026, 1702013, 5106040, 2553020, 1276510, 638255, 1914766, 957383, 2872150, 1436075, 4308226, 2154113, 6462340, 3231170, 1615585, 4846756, 2423378, 1211689, 3635068, 1817534, 908767, 2726302, 1363151, 4089454, 2044727, 6134182, 3067091, 9201274, 4600637, 13801912, 
6900956, 3450478, 1725239, 5175718, 2587859, 7763578, 3881789, 11645368, 5822684, 2911342, 1455671, 4367014, 2183507, 6550522, 3275261, 9825784, 4912892, 2456446, 1228223, 3684670,1842335, 5527006, 2763503, 8290510, 4145255, 12435766, 6217883, 18653650, 9326825, 27980476, 13990238, 6995119, 20985358, 10492679, 31478038, 15739019, 47217058, 23608529, 70825588, 35412794, 17706397, 53119192, 26559596, 13279798, 6639899, 19919698, 9959849, 29879548, 14939774, 7469887, 22409662, 11204831, 33614494, 16807247, 50421742, 25210871, 75632614, 37816307, 113448922, 56724461, 170173384, 85086692, 42543346, 21271673, 63815020, 31907510, 15953755, 47861266, 23930633, 71791900, 35895950, 17947975, 53843926, 26921963, 80765890, 40382945, 121148836, 60574418, 30287209, 90861628, 45430814, 22715407, 68146222, 34073111, 102219334, 51109667, 153329002, 76664501, 229993504, 114996752, 57498376, 28749188, 14374594, 7187297, 21561892, 10780946, 5390473, 16171420, 8085710, 4042855, 12128566, 6064283, 18192850, 9096425, 27289276, 13644638, 6822319, 20466958, 10233479, 30700438, 15350219, 46050658, 23025329, 69075988, 34537994, 17268997, 51806992, 25903496, 12951748, 6475874, 3237937, 9713812, 4856906, 2428453, 7285360, 3642680, 1821340, 910670, 455335, 1366006, 683003, 2049010, 1024505, 3073516, 1536758, 768379, 2305138, 1152569, 3457708, 1728854, 864427, 2593282, 1296641, 3889924, 1944962, 972481, 2917444, 1458722, 729361, 2188084, 1094042, 547021, 1641064, 820532, 410266, 205133, 615400, 307700, 153850, 76925, 230776, 115388, 57694, 28847, 86542, 43271, 129814, 64907, 194722, 97361, 292084, 146042, 73021, 219064, 109532, 54766, 27383, 82150, 41075, 123226, 61613, 184840, 92420, 46210, 23105, 69316, 34658, 17329, 51988, 25994, 12997, 38992, 19496, 9748, 4874, 2437, 7312, 3656, 1828, 914, 457, 1372, 686, 343, 1030, 515, 1546, 773, 2320, 1160, 580, 290, 145, 436, 218, 109, 328, 164, 82, 41, 124, 62, 31, 94, 47, 142, 71, 214, 107, 322, 161, 484, 242, 121, 364, 182, 91, 274, 137, 412, 206, 
103, 310, 155, 466, 233, 700, 350, 175, 526, 263, 790, 395, 1186, 593, 1780, 890, 445, 1336, 668, 334, 167, 502, 251, 754, 377, 1132,566, 283, 850, 425, 1276, 638, 319, 958, 479, 1438, 719, 2158, 1079, 3238,1619, 4858, 2429, 7288, 3644, 1822, 911, 2734, 1367, 4102, 2051, 6154, 3077, 9232, 4616, 2308, 1154, 577, 1732, 866, 433, 1300, 650, 325, 976, 488, 244, 122, 61, 184, 92, 46, 23, 70, 35, 106, 53, 160, 80, 40, 20, 10, 5, 16, 8, 4, 2, 1]

Sparsity calculations...

Computed sequence lengths 2168611

Largest term: 56991483520

Test range: 1 1000000

Biggest gap: 4508198208

Sparsity: 0.00175%

# If True, will precompute powers of 2 and multiples of 2;
# in practice this made little difference on 64-bit hardware
OPTIMISE = True

def build_sequence(n):
    """return sequence as a list given the starting number

    Uses the trail of data left by compute_sequence"""
    compute_sequence(n)  # called only to fill next_num along the trail
    sequence = []
    while n:
        sequence.append(n)
        n = next_num[n]
    return sequence

def compute_sequence(n):
    """lazily compute sequence lengths for the Collatz problem"""
    if n in seqlength:
        return seqlength[n]
    if n not in next_num:
        # NOTE: (some) evens are pre-computed
        next_num[n] = 3 * n + 1 if n % 2 else n // 2
    seqlength[n] = 1 + compute_sequence(next_num[n])
    return seqlength[n]

import time

start = time.time()
highest_number = 1000000
highest_term = highest_number * 3 + 1
highest_term += 1 if highest_term % 2 else 0
next_num = {2: 1}

if OPTIMISE:
    # quickly pre-compute (some of) the evens (used for n = n//2 if n is even)
    # how many should we precompute? Any mathematicians?
    doubles = range(2, highest_term, 2)
    numbers = range(1, highest_term // 2)
    next_num = dict(zip(doubles, numbers))

# mark 1 as the end-point of any sequence
next_num[1] = 0

# initialise the sequence lengths
seqlength = {}
seqlength[1] = 0
seqlength[2] = 1

if OPTIMISE:
    # powers of 2 are trivial: 2**n has sequence length n
    n = 2
    pwr = 4
    while pwr < highest_term:
        seqlength[pwr] = n
        pwr = pwr * 2
        n += 1

max_length = 0
for n in range(3, highest_number + 1):
    length = compute_sequence(n)
    if length > max_length:
        max_length = length
        winning_number = n

print("And the winner is {0} with sequence length {1}".format(winning_number, max_length))
end = time.time()
print("Time (s): ", (end - start))
print("Sequence is:")
print(build_sequence(winning_number))

# Sparsity calculation
sorted_seqlengths = sorted(seqlength.keys())
print("Sparsity calculations...")
print("Computed sequence lengths", len(seqlength))
largest_term = sorted_seqlengths[-1]
print("Largest term: ", largest_term)
print("Test range: ", 1, highest_number)
gaps = (second - first for first, second in zip(sorted_seqlengths[0:-1], sorted_seqlengths[1:]))
biggest_gap = 0
for gap in gaps:
    if biggest_gap < gap:
        biggest_gap = gap
print("Biggest gap: ", biggest_gap)
print("Sparsity: {0:.5f}%".format(highest_number / largest_term * 100))