Struggling for inspiration with lists

Denis McMahon · Dec 17, 2013

Hi

I have a list of data that presents as:

timestamp: value

Timestamps are used solely to determine the sequence of items in the list.

I want to find the longest repeated sequence of values in the list.
Example, in the following list:

data = { 0: "d", 1: "x", 2: "y", 3: "t", 4: "d", 5: "y", 77: "g"' 78:
"h", 79: "x", 80: "y", 206: "t", 210: "d", 211: "x" }

I would pull out the sequence x-y-t-d (starting at 1 and 79)

I need to keep the timestamp / data association because I need to
generate output that identifies (a) the longest repeated sequence (b) how
many elements in the longest repeated sequence (c) at what timestamps
each occurrence started.

I'm not sure of the best way, programatically, to aproach this task,
which means I'm unsure whether eg a list of tuples ( time, data ) or an
OrderedDict keyed on the timestamp is the best starting point.

I can make a list of tuples using:

d = [ (k,v) for k,v in data ]

and with the list of tuples, I can do something like:

d.sort( key=lambda tup: tup[0] )

max_start_a = 0
max_start_b = 0
max_len = 0
i = 0

while i < len( d ):

j = i + 1

while j < len( d ):

o = 0

while j+o < len( d ) and d[i+o][1] == d[j+o][1]:

o += 1

if o > max_len:

max_len = 0
max_start_a = i
max_start_b = j

j += 1

i += 1

print d[max_start_a][0], d[max_start_b][0], max_len

Is there a better way to do this?

Tim Chase · Dec 17, 2013

I need to keep the timestamp / data association because I need to
generate output that identifies (a) the longest repeated sequence
(b) how many elements in the longest repeated sequence (c) at what
timestamps each occurrence started.

I'm not sure of the best way, programatically, to aproach this
task, which means I'm unsure whether eg a list of tuples ( time,
data ) or an OrderedDict keyed on the timestamp is the best
starting point. [snip]
Is there a better way to do this?

You might have to define your criteria for "best". Things that
might play into it:

- what happens if there's more than one "longest" sequence?

- can you specify a minimum length that you don't have to track? (a
maximum sequence of length-1 might be uninteresting)

- how large can your input data grow to be? If it's huge, you're
looking at searching a problem space something like O(N^2) or
possibly O(N log N) if you're lucky. Which in turn can involve
lots of memory consumption.

- do you just want to know what that sequence is, or do you want to
know all the locations/offsets where it the longest sequence was
found?

I've played with two simple versions, the first is naive, while the
second tries to use a little bit of smarts to eliminate possibilities
when you know you have at least one match of a certain length. The
result of the function does allow for finding a plurality of longest
sequences, but my demo code just snags one with the max() function.
I leave the "find multiple maxes" as an exercise for the reader.

-tkc

def build1(d):
sequences = {} # sequence -> [list of start offsets]
for i in range(len(d)):
for j in range(i+1, len(d)):
key = tuple(value for timestamp, value in d[i:j])
sequences.setdefault(key, []).append(i)
return sequences

def build2(d):
sequences = {} # sequence -> [list of start offsets]
longest = 1
for i in range(len(d)):
for j in range(i+longest, len(d)):
key = tuple(value for timestamp, value in d[i:j])
if key in sequences:
sequences[key].append(i)
if len(sequences[key]) > longest:
longest = len(sequences[key])
print longest
else:
sequences[key] =
return sequences

data = {
0: "d", 1: "x", 2: "y", 3: "t", 4: "d", 5: "y", 77: "g",
78: "h", 79: "x", 80: "y", 206: "t", 210: "d", 211: "x",
}
d = data.items()
d.sort()

for fn in (build1, build2):
print fn.__name__
sequences = fn(d)
longest_key, offsets = max((
pair
for pair in sequences.iteritems()
if len(pair[1]) > 1
), key=lambda (k,v): len(k))
print "Longest sequence %r found at %s" % (
longest_key,
[d[0] for i in offsets],
)
for k,v in sequences.iteritems():
if len(v) > 1:
print "%s: %r" % (k,v)
print '=' * 50

..

S N log log N sort	6	Oct 5, 2025
struggling with these problems	2	Jan 29, 2013
Need help with this Python code.	2	Jun 13, 2023
Trouble with prediction code, for the life of me I can't figure out why it isnt running properly. Help would be appreciated.	0	Jul 8, 2023
Help for my project in the last minute	0	Apr 23, 2022
Secure Keyboard v2.0 Modern C++ Virtual Keyboard for Windows (Glassmorphism UI, Clipboard Auto-Clear)	0	Mar 26, 2026
Could you fix my code please? (i get no output after inputing the number)	1	Oct 16, 2023
Need help in debugging tic tac toe (Beginner)	0	Jun 28, 2023

Struggling for inspiration with lists

Denis McMahon

Tim Chase

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads