Need help detecting overlapping ranges

  • Thread starter Bryan Richardson
  • Start date
B

Bryan Richardson

Hello all,

I am writing some code where I create a bunch of ranges, then at the end
I want to create new ranges out of any ranges that overlap one another.
For example, say I have the following ranges:

(1..5) (7..11) (22..29) (5..8)

Given the ranges above, I want to end up with the following ranges:

(1..11) (22..29)

Here is the code I've come up with so far (ranges is an array of ranges
similar to what I described above in my example):

ranges = @failed.outages
changes = true
while changes
changes = false
outages = ranges.collect { |range| range.to_a }
ranges.clear
while !outages.empty?
outage = outages.shift
outages.each do |n|
unless (outage & n).empty?
outage = (outage + n).uniq.sort
outages.delete(n)
changes = true
end
end
ranges << (outage.first..outage.last)
end
end
return ranges.sort { |a,b| a.first <=> b.first }

This code works, but it is *EXTREMELY* slow (my current array of ranges
is averaging out to ~24000 range elements). Anyone have an idea of how
to speed it up?
 
S

Siep Korteling

Bryan said:
Hello all,

I am writing some code where I create a bunch of ranges, then at the end
I want to create new ranges out of any ranges that overlap one another.
For example, say I have the following ranges:

(1..5) (7..11) (22..29) (5..8)

Given the ranges above, I want to end up with the following ranges:

(1..11) (22..29)
(...)
This code works, but it is *EXTREMELY* slow (my current array of ranges
is averaging out to ~24000 range elements). Anyone have an idea of how
to speed it up?

Sets are usually much faster then large arrays and sets have a divide
method.

require 'set'
ranges=[(1..5),(7..11),(22..29),(5..8)]
set=Set.new
ranges.each do|range|
range.each do|el|
set<<el
end
end
sets = set.divide { |i,j| (i - j).abs == 1 } #straight from the
documentation!

p sets
#<Set: {#<Set: {27, 22, 28, 23, 29, 24, 25, 26}>, #<Set: {5, 11, 6, 1,
7, 2, 8, 3, 9, 4, 10}>}>

hth,

Siep
 
S

Stefan Lang

2008/8/6 Bryan Richardson said:
Hello all,

I am writing some code where I create a bunch of ranges, then at the end
I want to create new ranges out of any ranges that overlap one another.
For example, say I have the following ranges:

(1..5) (7..11) (22..29) (5..8)

Given the ranges above, I want to end up with the following ranges:

(1..11) (22..29)

Here is the code I've come up with so far (ranges is an array of ranges
similar to what I described above in my example):

ranges = @failed.outages
changes = true
while changes
changes = false
outages = ranges.collect { |range| range.to_a }
ranges.clear
while !outages.empty?
outage = outages.shift
outages.each do |n|
unless (outage & n).empty?
outage = (outage + n).uniq.sort
outages.delete(n)
changes = true
end
end
ranges << (outage.first..outage.last)
end
end
return ranges.sort { |a,b| a.first <=> b.first }

This code works, but it is *EXTREMELY* slow (my current array of ranges
is averaging out to ~24000 range elements). Anyone have an idea of how
to speed it up?

I guess it's slow because it's doing many Range to Array conversions.
Here's a version that does no such conversions:

# Assumption: for every range: range.end >= range.begin
def overlap(ranges)
ranges = ranges.dup
new_ranges = []
until ranges.empty?
cur = ranges.pop
overlap = ranges.find { |range|
cur.member?(range.begin) || cur.member?(range.end) ||
range.member?(cur.begin) || range.member?(cur.end)
}
if overlap
new_begin = cur.begin < overlap.begin ? cur.begin : overlap.begin
new_end = cur.end > overlap.end ? cur.end : overlap.end
join = Range.new(new_begin, new_end)
ranges.map! { |range| range.equal?(overlap) ? join : range }
else
new_ranges << cur
end
end
new_ranges
end

(And in case Gmail messes up the formating:
http://pastecode.com/?show=m69851851)

So I've tried it only with the one example you've given and
I didn't actually benchmark it.

Stefan
 
A

ara.t.howard

Hello all,

I am writing some code where I create a bunch of ranges, then at the
end
I want to create new ranges out of any ranges that overlap one
another.
For example, say I have the following ranges:

(1..5) (7..11) (22..29) (5..8)

Given the ranges above, I want to end up with the following ranges:

(1..11) (22..29)

Here is the code I've come up with so far (ranges is an array of
ranges
similar to what I described above in my example):

ranges = @failed.outages
changes = true
while changes
changes = false
outages = ranges.collect { |range| range.to_a }
ranges.clear
while !outages.empty?
outage = outages.shift
outages.each do |n|
unless (outage & n).empty?
outage = (outage + n).uniq.sort
outages.delete(n)
changes = true
end
end
ranges << (outage.first..outage.last)
end
end
return ranges.sort { |a,b| a.first <=> b.first }

This code works, but it is *EXTREMELY* slow (my current array of
ranges
is averaging out to ~24000 range elements). Anyone have an idea of
how
to speed it up?

this should be quite fast with a few constraints. the main thing is
that you do not have to convert ranges to detect the overlap nor to do
the merge.




cfp:~ > cat a.rb
def collapse_inclusive_integer *ranges
ranges.flatten!
ranges.compact!

ranges.each do |r|
raise ArgumentError, "exclusive range #{ range.inspect }" if
r.exclude_end?
raise ArgumentError, "non-integer range #{ range.inspect }" unless
Integer === r.begin and Integer === r.end
end

overlaps = lambda do |i,j|
a, b = ranges, ranges[j]
a.begin <= b.end and b.begin <= a.end
end

merge = lambda do |i,j|
a, b = ranges, ranges[j]
values = a.begin, a.end, b.begin, b.end
min, max = values.min, values.max
range = min .. max
src, dst = i < j ? [i,j] : [j,i]
ranges[src] = range
ranges.delete_at dst
range
end

loop {
catch('start over'){
size = ranges.size
size.times do |i|
size.times do |j|
next if i == j
if overlaps[i,j]
merge[i,j]
throw 'start over'
end
end
end
return ranges
}
}
end

ranges = 1..5, 7..11, 22..29, 5..8

p ranges
p collapse_inclusive_integer(ranges)





cfp:~ > ruby a.rb
[1..5, 7..11, 22..29, 5..8]
[1..11, 22..29]


a @ http://codeforpeople.com/
 
M

Martin DeMello

Hello all,

I am writing some code where I create a bunch of ranges, then at the end
I want to create new ranges out of any ranges that overlap one another.
For example, say I have the following ranges:

(1..5) (7..11) (22..29) (5..8)

Given the ranges above, I want to end up with the following ranges:

(1..11) (22..29)

Bit busy at work, so I don't have time to play with code, but mostly
your algorithm is slow. This one should work better:

1. sort the array of ranges by startpoint
2. compare the first two ranges. if they overlap, merge them by
setting the second range equal to the merged range and the first to
nil
3. repeat with the next pair, and so on
4. compact the array

in action:

1..5, 5..8, 7..11, 22..29
nil, 1..8, 7..11, 22..29
nil, nil, 1..11, 22..29
no change
1..11, 22..29

Might be some corner cases I haven't taken care of but the basic
algorithm is O(n) and involves no range conversions.

martin
 
B

Bryan Richardson

Thanks to all who replied! I took the main advice of all of your posts
to be "don't convert ranges to arrays stupid!!!" and also "don't do this
recursively stupid!". :)

With that in mind, here's what I came up with (I didn't want to just
copy someone's code... I don't learn anything that way!) -- it seems to
be much faster:

def merge_outages
ranges = @failed.outages.sort { |a,b| a.first <=> b.first }
outages = Array.new
while !ranges.empty?
range = ranges.shift
loop do
if ranges.empty?
break
else
if (range.last + 1) >= ranges.first.first
range = (range.first..ranges.first.last)
ranges.shift
else
break
end
end
end
outages << range
end
return outages
end

What do you guys think of this approach? Am I overlooking something
that you guys suggested?
 
D

David A. Black

Hi --

Thanks to all who replied! I took the main advice of all of your posts
to be "don't convert ranges to arrays stupid!!!" and also "don't do this
recursively stupid!". :)

With that in mind, here's what I came up with (I didn't want to just
copy someone's code... I don't learn anything that way!) -- it seems to
be much faster:

def merge_outages
ranges = @failed.outages.sort { |a,b| a.first <=> b.first }
outages = Array.new
while !ranges.empty?
range = ranges.shift
loop do
if ranges.empty?
break
else
if (range.last + 1) >= ranges.first.first
range = (range.first..ranges.first.last)
ranges.shift
else
break
end
end
end
outages << range
end
return outages
end

I did something similar in trying to implement Martin's algorithm. I
haven't tested it beyond eyeballing the results for this one run:

ranges = [(1..5), (7..11), (22..29), (5..8)].sort_by {|r| r.first }
outages = [ranges.shift]

ranges.each do |r|
if outages[-1].include?(r.first)
outages[-1] = Range.new(outages[-1].first, r.last)
else
outages.push(r)
end
end

p outages


David
 
S

Stefan Lang

2008/8/6 Martin DeMello said:
Bit busy at work, so I don't have time to play with code, but mostly
your algorithm is slow. This one should work better:

1. sort the array of ranges by startpoint
2. compare the first two ranges. if they overlap, merge them by
setting the second range equal to the merged range and the first to
nil
3. repeat with the next pair, and so on
4. compact the array

in action:

1..5, 5..8, 7..11, 22..29
nil, 1..8, 7..11, 22..29
nil, nil, 1..11, 22..29
no change
1..11, 22..29

Might be some corner cases I haven't taken care of but the basic
algorithm is O(n) and involves no range conversions.

It's O(n), but only if you write a radix sort to sort the ranges ;-)

Stefan
 
B

Bryan Richardson

Hi David,

I like the conciseness of your code, but I think it only works for when
the end of one range is the same as the beginning of another. I need to
also handle the times when ranges overlap more than that (i.e. 1..5,
3..6 -> 1..6). Thanks for thinking about this!

--
Thanks!
Bryan
Hi --

outages = Array.new
break
end
end
end
outages << range
end
return outages
end

I did something similar in trying to implement Martin's algorithm. I
haven't tested it beyond eyeballing the results for this one run:

ranges = [(1..5), (7..11), (22..29), (5..8)].sort_by {|r| r.first }
outages = [ranges.shift]

ranges.each do |r|
if outages[-1].include?(r.first)
outages[-1] = Range.new(outages[-1].first, r.last)
else
outages.push(r)
end
end

p outages


David
 
B

Bryan Richardson

It actually looks like the latest code I posted doesn't seem to work
either... I'm getting gaps where I shouldn't be. :(
 
A

ara.t.howard

Can you elaborate at all?


cfp:~ > ruby -e' p ( 0 ... 1 ).to_a '
[0]


cfp:~ > ruby -e' p ( 0 .. 1 ).to_a '
[0, 1]


one range included the endpoint, one does not. with integer ranges
you could guess the prior end point, but not with floats

cfp:~ > ruby -e' p ( 0 ... 1.0 ).to_a '
[0]


cfp:~ > ruby -e' p ( 0 ... 1.0 ).include?(0.999999999999999) '
true


a @ http://codeforpeople.com/
 
B

Bryan Richardson

Actually, never mind what I said about this code in a previous post... I
misunderstood it. I think it works perfectly... still testing!!! :)
Hi --

I did something similar in trying to implement Martin's algorithm. I
haven't tested it beyond eyeballing the results for this one run:

ranges = [(1..5), (7..11), (22..29), (5..8)].sort_by {|r| r.first }
outages = [ranges.shift]

ranges.each do |r|
if outages[-1].include?(r.first)
outages[-1] = Range.new(outages[-1].first, r.last)
else
outages.push(r)
end
end

p outages


David
 
D

David A. Black

Hi --

Hi David,

I like the conciseness of your code, but I think it only works for when
the end of one range is the same as the beginning of another. I need to
also handle the times when ranges overlap more than that (i.e. 1..5,
3..6 -> 1..6). Thanks for thinking about this!

Here's the heavy artillery :)

def merge_ranges(ranges)
ranges = ranges.sort_by {|r| r.first }
*outages = ranges.shift
ranges.each do |r|
lastr = outages[-1]
if lastr.last >= r.first - 1
outages[-1] = lastr.first..[r.last, lastr.last].max
else
outages.push(r)
end
end
outages
end

require 'test/unit'

class RangeTest < Test::Unit::TestCase
def test_one
ranges = [1..5, 20..20, 4..11, 40..45, 39..50]
assert_equal([1..11, 20..20, 39..50], merge_ranges(ranges))
end

def test_two
ranges = [1..5, 7..11, 22..29, 5..8]
assert_equal([1..11, 22..29], merge_ranges(ranges))
end

def test_three
ranges = [1..5, 7..11, 22..29, 6..8]
assert_equal([1..11, 22..29], merge_ranges(ranges))
end
end


David
 
B

Bryan Richardson

Crazy! I had *just* discovered the r.last lastr.last max problem when
you posted this. Thanks! :)
Here's the heavy artillery :)

def merge_ranges(ranges)
ranges = ranges.sort_by {|r| r.first }
*outages = ranges.shift
ranges.each do |r|
lastr = outages[-1]
if lastr.last >= r.first - 1
outages[-1] = lastr.first..[r.last, lastr.last].max
else
outages.push(r)
end
end
outages
end
 
V

vld

Hello all,

I am writing some code where I create a bunch of ranges, then at the end
I want to create new ranges out of any ranges that overlap one another.
For example, say I have the following ranges:

(1..5) (7..11) (22..29) (5..8)

Given the ranges above, I want to end up with the following ranges:

(1..11) (22..29)

Here is a module from our project earlier this year:

<pre>
# Module Spans and the associated Array "span" method implements an
array of numeric
# intervals with "merge", "complement", "fit", and other convenience
methods
# Copyright (C) 2008 GETCO LLC
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# The full text of the license as of August 06, 2008, is available at
# http://www.gnu.org/licenses/gpl-3.0.txt
#
class Array

# Module +Spans+ and the associated Array span method implements an
array numeric intervals - aka +spans+
module Spans

# Reduces the set of spans by combining the overlapping intervals
together to form larger intervals. Adjacent spans are not merged.
# *IMPORTANT:* The return set is a set of *copies* of objects,
*not the references* to the original objects. The new object have some
of their boundaries changed, but it is random, which objects are
deleted, and which have new boundaries.
#
# >> [ [5,5], [10,20], [3,4], [5,6], [15,25], [3,7] ].spans.merge
# => [[3, 7], [10, 25]]
def merge
result = deep_clone.position_sort!
result.each_index do |idx|

break if idx >= result.size - 1
curr = result[ idx ]
succ = result[ idx + 1 ]

if overlap?( curr, succ )
curr.send("#{@span_left}=", [curr.send(@span_left),
succ.send(@span_left) ].min )
curr.send("#{@span_right}=", [curr.send(@span_right),
succ.send(@span_right)].max )
result.delete_at( idx + 1)
redo
end
end
return result.spans( @span_left, @span_right )
end

# Provides the spans' complement to fill the specified interval.
The complement spans is a simple array of new arrays, responding to
the +first+ and +last+ methods.
#
# >> [ [5,5], [10,20], [3,4], [5,6], [15,25],
[3,7] ].spans.complement(-1,28)
# => [[-1, 3], [7, 10], [25, 28]]
def complement( left, right )
complemented = []
merge.clean!.each do |span|
break if right <= span.send(@span_left)
next if span.send(@span_right) < left
complemented << [ left, span.send(@span_left) ] if left <
span.send(@span_left)
left = span.send(@span_right)
end
complemented << [ left, right ] if left < right
complemented.spans
end

# Finds a first arbitrary span to fit the specified size. Returns
one of the original objects in the provided spans array.
#
# >> [ [5,5], [2,4], [3,7] ].spans.fit( 2 )
# => [2, 4]
def fit( requested_size )
# see in-code comment to fit_all
find do |a|
delta = a.send(@span_right) - a.send(@span_left)
(delta = requested_size) or (delta > requested_size)
end
end

# Finds all spans to fit the specified size. Returns a new, but
filtered, array of the original objects. The array alreasy has +Spans+
mixed in with the original boundary names.
#
# >> [ [5,5], [2,4], [3,7] ].spans.fit_all( 2 )
# => [[2, 4], [3, 7]]
def fit_all( requested_size )
self.select do |a|
( requested_size <= (a.send(@span_right) -
a.send(@span_left)) )
end.spans( @span_left, @span_right )
end

# Removes all empty and invalid spans from the the calling
instance
def clean!() delete_if{ |s| s.send(@span_left) >=
s.send(@span_right) } end

protected

def position_sort() sort { |a,b| position( a, b ) } end
def position_sort!() sort! { |a,b| position( a, b ) } end

def position( a, b )
if a.send(@span_left) < b.send(@span_left)
a.send(@span_right) > b.send(@span_left) ? 0 : -1 # -1 is
when "a" is at left of "b"
else
a.send(@span_left) < b.send(@span_right) ? 0 : 1 # 1 is when
"a" is at right of "b"
end
end

def overlap?( a, b )
position( a, b ) == 0
end

def deep_clone
klone = self.clone
klone.clear
each { |v| klone << v.clone }
klone
end
end

# The setter method for an array's first element
def first=( value) self[0] =value end

# The setter method for an array's last element
def last=( value) self[-1]=value end

# Sets up the calling +Array+ instance to be processed as +Spans+ -
a collection of intervals. If arguments provided those should be names
of getter/setter methods for left and right sides on intervals
accordingly.
def spans( left_name = nil, right_name = nil )
# Validates, that all spans in the array have properly ordered
ends (beginning before or at the end)
def valid?
each { |span| return false if span.send(@span_left) >
span.send(@span_right) }
return true
end

@span_left = ( left_name || "first" )
@span_right = ( right_name || "last" )
unless valid?
remove_instance_variable:)@span_left)
remove_instance_variable:)@span_right)
raise ArgumentError, "Can not consider this array for spans -
invalid intervals."
end
self.extend Spans
end
end
</pre>
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,770
Messages
2,569,583
Members
45,075
Latest member
MakersCBDBloodSupport

Latest Threads

Top