Need efficient search strategy in list of time intervals

L

lemmi

Hi,

I was wondering if someone of you could help me out with an
algorithmic problem I have. This is not a homework exercise but
something I need for a UI framework that I have developed
(www.dlsc.com).

I have a list of time intervals ([long,long]). The list is sorted
based on the start times of the intervals. I now need a fast
algorithm, which takes a time span TS as input and returns the index
of the first time interval A in the list that is contained in the
given time span. The rendering engine of my UI framework will then
iterate over the list starting at the calculated index.

A time interval A is "contained" when A does not end before TS begins
and also does not start after TS ends (!(A.end < TS.start || A.start >
TS.end).

The algorithm gets constantly called as part of a paint() method so it
needs to be extremely fast (linear is not good enough, logarithmic
would be ideal).

To make things complicated. The intervals can overlap each other. An
interval A that starts earlier than B might very well end after B.
A.start < B.start && A.end > B.end.

Note: a standard binary search does not work because it might find a
time interval that ends before the input time span TS and then
interrupts its search even though another time interval is located
before it that ends after it.

Dirk
 
R

Roedy Green

I have a list of time intervals ([long,long]). The list is sorted
based on the start times of the intervals. I now need a fast
algorithm, which takes a time span TS as input and returns the index
of the first time interval A in the list that is contained in the
given time span. The rendering engine of my UI framework will then
iterate over the list starting at the calculated index.

That sounds like a job for a binary search.
See http://mindprod.com/jgloss/binarysearch.html
 
L

lemmi

I have a list of time intervals ([long,long]). The list is sorted
based on the start times of the intervals. I now need a fast
algorithm, which takes a time span TS as input and returns the index
of the first time interval A in the list that is contained in the
given time span. The rendering engine of my UI framework will then
iterate over the list starting at the calculated index.

That sounds like a job for a binary search.
Seehttp://mindprod.com/jgloss/binarysearch.html

Binary search fails because of the fact that the time intervals can
overlap each other.
 
L

lemmi

I have created an SSCCE for this problem:
------------------------------------------------------------------


import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/*
* A simple short self-contained compilable example (SSCCE), which
illustrates
* the time span search problem.
*
* @author Dirk Lemmermann
*/
public class TimeSpanProblem {
private List<TimeSpan> list = new ArrayList<TimeSpan>();

private TimeSpanProblem() {
// One huge time span that contains the other three spans.
list.add(new TimeSpan(0, 70));

// Three contained time spans.
list.add(new TimeSpan(10, 20));
list.add(new TimeSpan(30, 40));
list.add(new TimeSpan(50, 60));

// Sort all spans based on their start time.
//
// actually not needed because we added the time spans in their
// correct order.
Collections.sort(list);

/*
* find the first visible time span contained in interval 25, 80.
The
* first one should be the huge time span 0, 70 but the binary
search
* stops looking 'to the left' because interval 10, 20 is
*/
int first = getFirstContainedSpan(new TimeSpan(25, 80));
System.out.println("first contained time span = " + first);

if (first != 0) {
System.err.println("Time span problem not solved!");
System.err.println("The huge time span that overlaps the");
System.err.println("other three spans was not found.");
}
}

public int getFirstContainedSpan(TimeSpan span) {
/*
* These are the javadocs of the binarySearch() method:
*
* Returns: the index of the search key, if it is contained in the
list;
* otherwise, (-(insertion point) - 1). The insertion point is
defined
* as the point at which the key would be inserted into the list:
the
* index of the first element greater than the key, or list.size()
if
* all elements in the list are less than the specified key. Note
that
* this guarantees that the return value will be >= 0 if and only if
the
* key is found.
*/
int result = Collections.binarySearch(list, span);
return -(result + 1);
}

class TimeSpan implements Comparable<TimeSpan> {
long start;

long end;

public TimeSpan(long start, long end) {
this.start = start;
this.end = end;
}

public int compareTo(TimeSpan otherSpan) {
long delta = otherSpan.start - start;
if (delta > 0) {
return -1;
} else if (delta < 0) {
return +1;
}
return 0;
}
}

public static void main(String[] args) {
new TimeSpanProblem();
}
}


--Dirk
 
L

Lasse Reichstein Nielsen

lemmi said:
I have a list of time intervals ([long,long]). The list is sorted
based on the start times of the intervals. I now need a fast
algorithm, which takes a time span TS as input and returns the index
of the first time interval A in the list that is contained in the
given time span. The rendering engine of my UI framework will then
iterate over the list starting at the calculated index.

Why iterate? The next element can't be assumed to be hit by TS, so you
might have to iterate over a long list of irrelevant objects anyway.
A time interval A is "contained" when A does not end before TS begins
and also does not start after TS ends (!(A.end < TS.start || A.start >
TS.end).

This is equivalent to (A.end >= TS.start && A.start <= TS.end).
I expect that A.start <= A.end (and likewise for TS) can be assumed.

That sounds like "overlaps", not "contained in". Example;
TS = [2,4]
A = [1,3]
satisfies the requirement.
The algorithm gets constantly called as part of a paint() method so it
needs to be extremely fast (linear is not good enough, logarithmic
would be ideal).

Sounds unlikely, at least in the worst case, since the end-points are
completely unordered. Worst case, assume a (long) list with same start
time, and only one interval has a large enough end time to overlap the
TS.

You should consider whether there is a better data structure than a list
for your purpose, to account for both start and end time.
I can't think of a one, but why think when you can google (for "time interval
data structure"). It seems an Interval Tree is what you need:
<URL:http://en.wikipedia.org/wiki/Interval_tree>.
Found this too:
<URL:http://i11www.iti.uni-karlsruhe.de/~awolff/map-labeling/design/old/interfaces/gen_interval_tree.ps>

Good luck
/L
 
L

lemmi

lemmi said:
I have a list of time intervals ([long,long]). The list is sorted
based on the start times of the intervals. I now need a fast
algorithm, which takes a time span TS as input and returns the index
of the first time interval A in the list that is contained in the
given time span. The rendering engine of my UI framework will then
iterate over the list starting at the calculated index.

Why iterate? The next element can't be assumed to be hit by TS, so you
might have to iterate over a long list of irrelevant objects anyway.

The iteration is smart enough to stop when it reaches a time span with
a start time larger than TS. So if a list contains 100 entries and the
index is (for example 33) then the paint method iterates over 33, 34,
35
and might stop at 36 if its start time is larger than TS.end.
 
R

Roedy Green

Note: a standard binary search does not work because it might find a
time interval that ends before the input time span TS and then
interrupts its search even though another time interval is located
before it that ends after it.

Predigest your list so that you decide which band number you want for
any point on the line. You can create dummy bands to handle areas not
covered. Then you need do your binary search only on the start points
of each band. You look for a band start point <= to the search point.
Have a look inside com.mindprod.jdisplay.PrettyCanvas. It does this.
http://mindprod.com/products.html#JDISPLAY
 
L

lemmi

Fair warning: Somebody aparently patented that idea.
/L

I have found quite a few references to the interval binary
tree algorithm and it seems like a basic computer science
concept. How can this be patented? Can you point me to the
patent?

Dirk
 
P

Patricia Shanahan

lemmi said:
Hi,

I was wondering if someone of you could help me out with an
algorithmic problem I have. This is not a homework exercise but
something I need for a UI framework that I have developed
(www.dlsc.com).

I have a list of time intervals ([long,long]). The list is sorted
based on the start times of the intervals. I now need a fast
algorithm, which takes a time span TS as input and returns the index
of the first time interval A in the list that is contained in the
given time span. The rendering engine of my UI framework will then
iterate over the list starting at the calculated index.

A time interval A is "contained" when A does not end before TS begins
and also does not start after TS ends (!(A.end < TS.start || A.start >
TS.end).

The algorithm gets constantly called as part of a paint() method so it
needs to be extremely fast (linear is not good enough, logarithmic
would be ideal).

To make things complicated. The intervals can overlap each other. An
interval A that starts earlier than B might very well end after B.
A.start < B.start && A.end > B.end.

Note: a standard binary search does not work because it might find a
time interval that ends before the input time span TS and then
interrupts its search even though another time interval is located
before it that ends after it.

Dirk

Does the start time of the probe interval ever reduce? If not, consider
the following:

Keep the intervals in a doubly linked list, such as
java.util.LinkedList, sorted in ascending order of start time.

For each search, scan the list from the start.

If the start time is greater than the probe end time, stop. You have
found all the overlaps for this probe.

Else if the end time is less than the probe start time, delete the interval.

Any interval that passes both those tests overlaps the probe. Each
interval is operated on at most once after it becomes irrelevant, and
otherwise you only look at one non-hit.

The harder problem is if the interface start time can reduce. If so,
perhaps you should think of your intervals as points in a two
dimensional space, indexed by start and end time. The probe interval is
a rectangular region of that space. Look at R-tree, KD-tree etc. to see
if there is something that helps.

Patricia
 
D

Daniel Pitts

Hi,

I was wondering if someone of you could help me out with an
algorithmic problem I have. This is not a homework exercise but
something I need for a UI framework that I have developed
(www.dlsc.com).

I have a list of time intervals ([long,long]). The list is sorted
based on the start times of the intervals. I now need a fast
algorithm, which takes a time span TS as input and returns the index
of the first time interval A in the list that is contained in the
given time span. The rendering engine of my UI framework will then
iterate over the list starting at the calculated index.

A time interval A is "contained" when A does not end before TS begins
and also does not start after TS ends (!(A.end < TS.start || A.start >
TS.end).

The algorithm gets constantly called as part of a paint() method so it
needs to be extremely fast (linear is not good enough, logarithmic
would be ideal).

To make things complicated. The intervals can overlap each other. An
interval A that starts earlier than B might very well end after B.
A.start < B.start && A.end > B.end.

Note: a standard binary search does not work because it might find a
time interval that ends before the input time span TS and then
interrupts its search even though another time interval is located
before it that ends after it.

Dirk

I've had to do this myself recently. What I did was created a class
that represented an Instant (either start or end), and a collection of
all spans that were in the instant (it took some time to build this
index for large datasets), the collection of Instants and be sorted,
and the you can do a binary search for the Instant. This worked
pretty well, but it did take some extra time to build the indexes.

I had a few other ideas that I haven't tried yet. One was to use
Lucene to build an index of my data.

The other was to keep track of the maximum interval size, sort the
intervals by start, binary search for a value, and then move backward
until the start + maxInterval < searchStart; This approach is
dependent on the data a little more, one long interval can cause a big
slow down.

Depending on your dataset size, it might actually be fast enough just
to do a linear search.
 
S

Sideswipe

Dirk,

I recently discussed the same thing in another post -- mapping ranged
values to 1 value. You can do it, and I will cross link you, but there
are caveats dealing with "comparable". The implementation I provided
in the groups is safe and works fine but it has limits (mainly because
I haven't worked past them).

Here is the link to the other post
http://groups.google.com/group/comp...db72482546b/edbf219682de8458#edbf219682de8458

It is, as others have suggested, based on a b-tree.

Christian Bongiorno
http://christian.bongiorno.org
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,062
Latest member
OrderKetozenseACV

Latest Threads

Top