more efficient date range comparison

D

Dave Woodworth

Hello all

I'm writing a calendar application that needs to check for conflicts
between recurring entries. Each Entry object has a recurrences() method
which returns an array of ranges - each range contains the start and end
times of each future occurence.

I need to check for conflicts between new and existing entries. I'm
doing this by checking that none of the future occurences of the new
entry clash with future occurences of existing entries:

def conflicts?(other)
conflicts = 0
recurrences.each do |my_rec|
other.recurrences.each do |other_rec|
start, finish = other_rec.first, other_rec.last
conflicts += 1 if my_rec.include?(start) ||
my_rec.include?(finish)
end
end
conflicts > 0
end

recurrences() defaults to returning all occurences between start time
and start time + 1 year

the problem is this method is not the most efficient in the world.
comparing just two entries, each with a daily recurrence over 1 year,
leads to 365 * 365 comparions (on my machine that takes 4+ seconds).
There may be any number of existing entries to compare a new entry to so
the method I have now is useless.

I don't have a computer science or maths background but I've been
reading various textbooks on algorithms and I havn't been able to find a
way to optimize the method. Does anyone else have any ideas?

thanks
Dave
 
L

LAMBEAU Bernard

First of all, you could return true as soon as you find a conflict
instead of counting them. However, it does not change anything from a
theoretical point of view: your method executes in O(n*m) time which
means that if recurrences has n entries and other.recurrences has m
entries, then the time spent in the method is a function of n*m in the
worst case: when no conflict arrises your method checks all pairs.

If I understand your need correctly (I'm not sure to understand what
an occurence is exactly), it seems that you could write an algorithm
that executes in O(m+n), which is really better. By sorting your
ranges, you could avoid many useless comparisons. It reminds me a
couple of algorithms that I've written a few month ago and which
computes the union/intersection of composite intervals. You can have a
look at code.chefbe.net: there's a project called CRUC with a well
documented class named Interval (rdoc is online) which contains these
algorithms.

I hope this helps (let me know)
blambeau
 
A

Albert Schlef

Dave said:
conflicts += 1 if my_rec.include?(start) ||
my_rec.include?(finish)

Off topic: I think you have a bug here. If my_rec is, say, 20..40, and
other_rec is, say, 10..50, your test won't detect this conflict (because
neither 10 nor 50 are included in the my_rec range).
 
L

LAMBEAU Bernard

It's a OR, not a AND, so it seems correct ... no? Seems to be the
classical overlap operator on simple intervals.

blambeau
 
L

LAMBEAU Bernard

Oops, you are perfectly right ... sorry!

blambeau

It's a OR, not a AND, so it seems correct ... no? Seems to be the
classical overlap operator on simple intervals.

blambeau

 
L

LAMBEAU Bernard

Extracted from the CRUC project I mentionned (see the intersection
method). I think the code below should make the job. Even if your
range lists are not initially sorted, this algorithm should be better
than the initial one: O(nlogn+mlogm+n+m) in the worst case.

# Assume you have two lists of ranges, each one being sorted by start point
my_ranges, other_ranges =3D ...

# Take the first two, as well as their extremities
r1, r2 =3D my_ranges.shift, other_ranges.shift
b1, e1 =3D r1.begin, r1.end
b2, e2 =3D r2.begin, r2.end

until (my_ranges.empty? or other_ranges.empty?)
if e1<b2
# my end is before its begin, we don't overlap
r1 =3D my_points.shift
b1, e1 =3D r1.begin, r1.end
elsif
# its end is before my begin, we don't overlap
r2 =3D other_points.shift
b2, e2 =3D r2.begin, r2.end
else
# intersection found, it's a conflict
return true
end
end

# All ranges are disjoint, no conflict
return false



blambeau

Oops, you are perfectly right ... sorry!

blambeau

It's a OR, not a AND, so it seems correct ... no? Seems to be the
classical overlap operator on simple intervals.

blambeau

 
D

Dave Woodworth

Albert said:
Off topic: I think you have a bug here. If my_rec is, say, 20..40, and
other_rec is, say, 10..50, your test won't detect this conflict (because
neither 10 nor 50 are included in the my_rec range).

good catch! thanks
 
D

Dave Woodworth

LAMBEAU said:
Extracted from the CRUC project I mentionned (see the intersection
method). I think the code below should make the job. Even if your
range lists are not initially sorted, this algorithm should be better
than the initial one: O(nlogn+mlogm+n+m) in the worst case.

# Assume you have two lists of ranges, each one being sorted by start
point
my_ranges, other_ranges = ...

# Take the first two, as well as their extremities
r1, r2 = my_ranges.shift, other_ranges.shift
b1, e1 = r1.begin, r1.end
b2, e2 = r2.begin, r2.end

until (my_ranges.empty? or other_ranges.empty?)
if e1<b2
# my end is before its begin, we don't overlap
r1 = my_points.shift
b1, e1 = r1.begin, r1.end
elsif
# its end is before my begin, we don't overlap
r2 = other_points.shift
b2, e2 = r2.begin, r2.end
else
# intersection found, it's a conflict
return true
end
end

# All ranges are disjoint, no conflict
return false



blambeau

thanks for your suggestions. I'm using a slightly modified version of
the code you posted and it's about 8x faster. I'm sure it could be
optimized further but it's good enough for now. thanks again!

dave
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,767
Messages
2,569,572
Members
45,046
Latest member
Gavizuho

Latest Threads

Top