algorithm help

Discussion in 'Ruby' started by Ara.T.Howard, Aug 2, 2005.

  1. Ara.T.Howard

    Ara.T.Howard Guest

    any of you knuth fans know some slick (read: fast) way to do this

    a = [ 1, 2, 3, 6, 7, 8 ,9, 42 ]

    a.ranges #=> [ (1..3), (6..9), (42..42) ]

    or am i doomed to O(n)?

    cheers.

    -a
    --
    ===============================================================================
    | email :: ara [dot] t [dot] howard [at] noaa [dot] gov
    | phone :: 303.497.6469
    | My religion is very simple. My religion is kindness.
    | --Tenzin Gyatso
    ===============================================================================
    Ara.T.Howard, Aug 2, 2005
    #1
    1. Advertising

  2. Ara.T.Howard

    Szymon Rozga Guest

    What exactly are you trying to do ?
    -s
    Szymon Rozga, Aug 2, 2005
    #2
    1. Advertising

  3. Ara.T.Howard

    Ara.T.Howard Guest

    On Tue, 2 Aug 2005, Szymon Rozga wrote:

    > What exactly are you trying to do ?
    > -s


    i have a massive (100-500MB) grid of solar elevations and a coresponding data
    set of binary format.

    i need to find the areas in the solar elevations file that are less than a
    certain amount and, based on those ranges, munge the data file in a certain
    way to fix an issue with 'wobbling' sensor aboard the satelite imaging the
    data. basically we have lines that look like

    ------------ | ------------
    ------------ | ------------

    where the '|'s are supposed to line up. as it turns out this is only
    happening in 'dark' areas of the data - therefore the solar elevations can be
    used to predict where the problem will occur within a few pixels. (the
    problem is caused by a photo-multiplier tube switching on when darkness is
    sensed - we think).

    so - i currently am munging the data, which is a terrible record based format,
    using writable mmap's of the file and this is very fast. the issue is
    constraining the munging to certain regions. narray allows me to say this
    sort of thing __very__ quicky:

    solar_elevations = NArray::sint 'solar_elevations.data', samples, lines

    lines.times |j|
    line = solar_elevations[ true, j ]
    dark = line.lt(-3).where
    ...
    ...
    end

    here dark is the array having values like

    [ 1, 2, 3, 6789, 6790, 6791 ]

    that i want to reduce to a list of ranges.

    in reality the ranges are huge and there will, in amost all cases except
    crossing the poles, be only one. also note that 'dark' is a sorted list. my
    current optimization is

    min, max = dark[0], dark[dark.size - 1]

    ranges =
    if((max - min + 1) != dark.size)
    slow_search dark
    else
    [ min .. max ]
    end

    ranges.each do |range|
    #
    # munge data based on ranges which are small and fast
    #
    end


    i can't think of anything better than brute force for the 'slow_search' but
    thought i'd throw it out there...

    cheers.

    -a
    --
    ===============================================================================
    | email :: ara [dot] t [dot] howard [at] noaa [dot] gov
    | phone :: 303.497.6469
    | My religion is very simple. My religion is kindness.
    | --Tenzin Gyatso
    ===============================================================================
    Ara.T.Howard, Aug 2, 2005
    #3
  4. He=B4s trying to identify all the sequences inside the original array.

    On 8/2/05, Szymon Rozga <> wrote:
    > What exactly are you trying to do ?
    > -s
    >=20
    >=20
    >
    Luiz Esmiralha, Aug 2, 2005
    #4
  5. Ara.T.Howard

    Ara.T.Howard Guest

    --8323328-254812897-1122958942=:22412
    Content-Type: MULTIPART/MIXED; BOUNDARY="8323328-254812897-1122958942=:22412"

    This message is in MIME format. The first part should be readable text,
    while the remaining parts are likely unreadable without MIME-aware tools.

    --8323328-254812897-1122958942=:22412
    Content-Type: TEXT/PLAIN; charset=X-UNKNOWN; format=flowed
    Content-Transfer-Encoding: QUOTED-PRINTABLE

    On Tue, 2 Aug 2005, Luiz Esmiralha wrote:

    > He=B4s trying to identify all the sequences inside the original array.


    based on the assumptions that

    * array is sorted
    * array is huge
    * array probably contains zero or one range
    * real upper limit of ranges is 2 but, in theory, could be n

    which makes it a bit easier...

    cheers.

    -a
    --=20
    =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
    =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
    =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
    =3D=3D=3D=3D
    | email :: ara [dot] t [dot] howard [at] noaa [dot] gov
    | phone :: 303.497.6469
    | My religion is very simple. My religion is kindness.
    | --Tenzin Gyatso
    =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
    =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
    =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
    =3D=3D=3D=3D

    --8323328-254812897-1122958942=:22412--
    --8323328-254812897-1122958942=:22412--
    Ara.T.Howard, Aug 2, 2005
    #5
  6. Ara.T.Howard <> wrote:
    >
    > here dark is the array having values like
    >
    > [ 1, 2, 3, 6789, 6790, 6791 ]
    >
    > that i want to reduce to a list of ranges.
    >
    > in reality the ranges are huge and there will, in amost all cases except
    > crossing the poles, be only one. also note that 'dark' is a sorted list. my
    > current optimization is
    >
    > min, max = dark[0], dark[dark.size - 1]
    >
    > ranges =
    > if((max - min + 1) != dark.size)
    > slow_search dark
    > else
    > [ min .. max ]
    > end
    >
    > ranges.each do |range|
    > #
    > # munge data based on ranges which are small and fast
    > #
    > end
    >
    >
    > i can't think of anything better than brute force for the 'slow_search' but
    > thought i'd throw it out there...


    This might work better: binary search for {x | (x - min) == (i_x - i_min)},
    set i_min to i_x+1 and repeat. For an added optimisation, run two loops
    in parallel, searching from each end.

    martin
    Martin DeMello, Aug 2, 2005
    #6
  7. Instead of looking for ranges, why not look for gaps in the sequence
    as you read in the input. Keep track of where each gap occurs
    (beginning and end), like an inverse range. If you do this as you
    populate your array, you won't have to go back and traverse the array
    afterwards (hence the O(n)).

    Hmmm...you would have thought about that by now if it were _that_ easy :)

    Sorry, Ara, I guess I don't fully understand your problem.

    Dan
    Daniel Amelang, Aug 2, 2005
    #7
  8. Ara.T.Howard wrote:
    > On Tue, 2 Aug 2005, Luiz Esmiralha wrote:
    >=20
    >> He=EF=BF=BDs trying to identify all the sequences inside the original =

    array.
    >=20
    >=20
    > based on the assumptions that
    >=20
    > * array is sorted
    > * array is huge
    > * array probably contains zero or one range
    > * real upper limit of ranges is 2 but, in theory, could be n
    >=20
    > which makes it a bit easier...
    >=20
    > cheers.
    >=20
    > -a


    How could the array contain zero ranges? Empty array?

    If I understand the practical assumptions we can make, then there are a
    bounded number (1,2) of "gaps" between ranges. Why not just binary
    search for them?

    --=20
    vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
    Joel VanderWerf, Aug 2, 2005
    #8
  9. Daniel Amelang wrote:
    > Instead of looking for ranges, why not look for gaps in the sequence
    > as you read in the input. Keep track of where each gap occurs
    > (beginning and end), like an inverse range. If you do this as you
    > populate your array, you won't have to go back and traverse the array
    > afterwards (hence the O(n)).


    It's already O(n) if you read the array sequentially. I guess we're
    assuming the array is given as a block of memory somewhere (Ara said
    something about mmap...), so if we're clever enough we may not need to
    traverse it even once.

    > Hmmm...you would have thought about that by now if it were _that_ easy :)
    >
    > Sorry, Ara, I guess I don't fully understand your problem.
    >
    > Dan
    >



    --
    vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
    Joel VanderWerf, Aug 2, 2005
    #9
  10. Joel VanderWerf wrote:
    > Ara.T.Howard wrote:
    > > On Tue, 2 Aug 2005, Luiz Esmiralha wrote:
    > >
    > >> He?s trying to identify all the sequences inside the original array.

    > >
    > >
    > > based on the assumptions that
    > >
    > > * array is sorted
    > > * array is huge
    > > * array probably contains zero or one range
    > > * real upper limit of ranges is 2 but, in theory, could be n
    > >
    > > which makes it a bit easier...
    > >
    > > cheers.
    > >
    > > -a

    >
    > How could the array contain zero ranges? Empty array?
    >
    > If I understand the practical assumptions we can make, then there are a
    > bounded number (1,2) of "gaps" between ranges. Why not just binary
    > search for them?


    [
    [3, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41],
    [3, 4, 5, 6, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41],
    [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 37, 38, 39, 40, 41],
    [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17],
    [3, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42],
    [3, 4, 5, 6, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42],
    [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 37, 38, 39, 40, 41, 42],
    [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 42]
    ].each { |dark|

    min, max = dark[0], dark[-1]
    if((max - min + 1) == dark.size)
    ranges = [ min .. max ]
    else
    dif = min
    lower, upper = 0, dark.size-1
    while upper - lower > 1
    print "."
    i = (lower + upper)/2
    if dark - i == dif
    lower = i
    else
    upper = i
    end
    end
    ranges = [ (dark[0]..dark[lower]), (dark[upper]..dark[-1]) ]
    end

    p ranges
    }

    -----
    ouput:

    ....[3..3, 28..41]
    .....[3..6, 31..41]
    .....[3..12, 37..41]
    [3..17]
    ....[3..3, 28..42]
    .....[3..6, 31..42]
    .....[3..12, 37..42]
    .....[3..17, 42..42]
    William James, Aug 2, 2005
    #10
  11. Ara.T.Howard

    Pit Capitain Guest

    Martin DeMello schrieb:
    > Ara.T.Howard <> wrote:
    >
    >>here dark is the array having values like
    >>
    >> [ 1, 2, 3, 6789, 6790, 6791 ]
    >>
    >>that i want to reduce to a list of ranges.
    >>
    >>in reality the ranges are huge and there will, in amost all cases except
    >>crossing the poles, be only one. also note that 'dark' is a sorted list. my
    >>current optimization is
    >>
    >> min, max = dark[0], dark[dark.size - 1]
    >>
    >> ranges =
    >> if((max - min + 1) != dark.size)
    >> slow_search dark
    >> else
    >> [ min .. max ]
    >> end
    >>
    >> ranges.each do |range|
    >> #
    >> # munge data based on ranges which are small and fast
    >> #
    >> end
    >>
    >>i can't think of anything better than brute force for the 'slow_search' but
    >>thought i'd throw it out there...

    >
    > This might work better: binary search for {x | (x - min) == (i_x - i_min)},
    > set i_min to i_x+1 and repeat. For an added optimisation, run two loops
    > in parallel, searching from each end.


    Just in case you haven't implemented Martin's binary search idea yet:

    def gap?( data, min, max )
    data[ max ] - data[ min ] > max - min
    end

    def find_gaps( data, min, max, acc )
    if gap?( data, min, max )
    if max - min == 1
    acc << max
    else
    mid = ( max + min ) / 2
    find_gaps( data, min, mid, acc )
    find_gaps( data, mid, max, acc )
    end
    end
    end

    def gaps( data )
    acc = [ 0 ]
    find_gaps( data, 0, data.size - 1, acc )
    acc << data.size
    end

    def ranges( *data )
    gaps = gaps( data )
    ( 1 ... gaps.size ).map { |idx|
    data[ gaps[ idx - 1 ] ] .. data[ gaps[ idx ] - 1 ]
    }
    end

    p ranges( 1, 2, 3, 6789, 6790, 6791 )
    # => [1..3, 6789..6791]

    Regards,
    Pit
    Pit Capitain, Aug 2, 2005
    #11
  12. Ara.T.Howard

    Ara.T.Howard Guest

    On Tue, 2 Aug 2005, Pit Capitain wrote:

    >> This might work better: binary search for {x | (x - min) == (i_x - i_min)},
    >> set i_min to i_x+1 and repeat. For an added optimisation, run two loops
    >> in parallel, searching from each end.

    >
    > Just in case you haven't implemented Martin's binary search idea yet:
    >
    > def gap?( data, min, max )
    > data[ max ] - data[ min ] > max - min
    > end
    >
    > def find_gaps( data, min, max, acc )
    > if gap?( data, min, max )
    > if max - min == 1
    > acc << max
    > else
    > mid = ( max + min ) / 2
    > find_gaps( data, min, mid, acc )
    > find_gaps( data, mid, max, acc )
    > end
    > end
    > end
    >
    > def gaps( data )
    > acc = [ 0 ]
    > find_gaps( data, 0, data.size - 1, acc )
    > acc << data.size
    > end
    >
    > def ranges( *data )
    > gaps = gaps( data )
    > ( 1 ... gaps.size ).map { |idx|
    > data[ gaps[ idx - 1 ] ] .. data[ gaps[ idx ] - 1 ]
    > }
    > end
    >
    > p ranges( 1, 2, 3, 6789, 6790, 6791 )
    > # => [1..3, 6789..6791]


    sorry it took me so long to respond pit and martin... got sidetracked. your
    binary search idea was perfect! for the actual data i'll be working with this
    should be either O(1) or O(log(n)) - thought it can obviously degrade in the
    general case. i used your idea in essentially the reverse pit, here's what i
    ended up with:


    harp:~ > cat a.rb

    def sequences list
    distance = lambda do |a,b|
    (b - a).abs
    end

    continuous = lambda do |range, list|
    first, last = range.first, range.last
    a, b = list[first], list[last]
    (list.empty? or (distance[a,b] == distance[first,last])) ? (a .. b) : nil
    end

    discrete = lambda do |range, list|
    first, last = range.first, range.last
    edge = last + 1
    edge >= list.length or distance[list[last], list[edge]] > 1
    end

    sequence = lambda do |range, list|
    first, last = range.first, range.last
    list[first] .. list[last]
    end

    last = list.length - 1
    a, b = 0, last

    accum = []

    while a <= b and a < list.length and b < list.length
    range = a .. b

    if continuous[ range, list ]
    if discrete[ range, list ]
    accum << sequence[ range, list ]
    a, b = b + 1, last # move a and b up
    else
    b = b + distance[b, last] / 2 # move b up
    end
    else
    b = a + distance[a, b] / 2 # move b down
    end
    end

    accum
    end

    tests = [
    [],
    [0],
    [0,1],
    [0,1,2],
    [0,1,2,3],
    [0,3],
    [0,3,5],
    [0,3,5,7],
    [0,1,3,4],
    [0,1,3,4,6,7],
    [0,1,3,4,6,7,9,10],
    [0,1,3,4,6,7,10],
    [0,3,4,6,7,9,10],
    [0,1,3],
    [0,3,4],
    [0,-1],
    [0,-1,-2],
    [0,-1,-3,-4],
    [0,-1,-3,-4,-6],
    [0,-1,-3,-4,-6,-7],
    [-1,-3,-4,-6,-7],
    [-3,-4],
    [-3,-4,-6, *(-42 .. -8).to_a.reverse],
    ]

    tests.each do |a|
    printf "%-37.37s => %37.37s\n", a.inspect, sequences(a).inspect
    a.reverse!
    printf "%-37.37s => %37.37s\n", a.inspect, sequences(a).inspect
    end



    harp:~ > ruby a.rb
    [] => []
    [] => []
    [0] => [0..0]
    [0] => [0..0]
    [0, 1] => [0..1]
    [1, 0] => [1..0]
    [0, 1, 2] => [0..2]
    [2, 1, 0] => [2..0]
    [0, 1, 2, 3] => [0..3]
    [3, 2, 1, 0] => [3..0]
    [0, 3] => [0..0, 3..3]
    [3, 0] => [3..3, 0..0]
    [0, 3, 5] => [0..0, 3..3, 5..5]
    [5, 3, 0] => [5..5, 3..3, 0..0]
    [0, 3, 5, 7] => [0..0, 3..3, 5..5, 7..7]
    [7, 5, 3, 0] => [7..7, 5..5, 3..3, 0..0]
    [0, 1, 3, 4] => [0..1, 3..4]
    [4, 3, 1, 0] => [4..3, 1..0]
    [0, 1, 3, 4, 6, 7] => [0..1, 3..4, 6..7]
    [7, 6, 4, 3, 1, 0] => [7..6, 4..3, 1..0]
    [0, 1, 3, 4, 6, 7, 9, 10] => [0..1, 3..4, 6..7, 9..10]
    [10, 9, 7, 6, 4, 3, 1, 0] => [10..9, 7..6, 4..3, 1..0]
    [0, 1, 3, 4, 6, 7, 10] => [0..1, 3..4, 6..7, 10..10]
    [10, 7, 6, 4, 3, 1, 0] => [10..10, 7..6, 4..3, 1..0]
    [0, 3, 4, 6, 7, 9, 10] => [0..0, 3..4, 6..7, 9..10]
    [10, 9, 7, 6, 4, 3, 0] => [10..9, 7..6, 4..3, 0..0]
    [0, 1, 3] => [0..1, 3..3]
    [3, 1, 0] => [3..3, 1..0]
    [0, 3, 4] => [0..0, 3..4]
    [4, 3, 0] => [4..3, 0..0]
    [0, -1] => [0..-1]
    [-1, 0] => [-1..0]
    [0, -1, -2] => [0..-2]
    [-2, -1, 0] => [-2..0]
    [0, -1, -3, -4] => [0..-1, -3..-4]
    [-4, -3, -1, 0] => [-4..-3, -1..0]
    [0, -1, -3, -4, -6] => [0..-1, -3..-4, -6..-6]
    [-6, -4, -3, -1, 0] => [-6..-6, -4..-3, -1..0]
    [0, -1, -3, -4, -6, -7] => [0..-1, -3..-4, -6..-7]
    [-7, -6, -4, -3, -1, 0] => [-7..-6, -4..-3, -1..0]
    [-1, -3, -4, -6, -7] => [-1..-1, -3..-4, -6..-7]
    [-7, -6, -4, -3, -1] => [-7..-6, -4..-3, -1..-1]
    [-3, -4] => [-3..-4]
    [-4, -3] => [-4..-3]
    [-3, -4, -6, -8, -9, -10, -11, -12, - => [-3..-4, -6..-6, -8..-42]
    [-42, -41, -40, -39, -38, -37, -36, - => [-42..-8, -6..-6, -4..-3]



    your ideas were essential to getting the peice of code i'm working on now
    running in a reasonable amount of time - thanks to all the contributors!

    -a
    --
    ===============================================================================
    | email :: ara [dot] t [dot] howard [at] noaa [dot] gov
    | phone :: 303.497.6469
    | My religion is very simple. My religion is kindness.
    | --Tenzin Gyatso
    ===============================================================================
    Ara.T.Howard, Aug 3, 2005
    #12
  13. Ara.T.Howard wrote:

    <snip>

    > your ideas were essential to getting the peice of code i'm working on now
    > running in a reasonable amount of time - thanks to all the contributors!
    >
    > -a


    No applause please. Just throw money (at RubyCentral).

    :)

    Regards,

    Dan
    Daniel Berger, Aug 3, 2005
    #13
  14. Ara.T.Howard

    Ara.T.Howard Guest

    On Thu, 4 Aug 2005, Daniel Berger wrote:

    > Ara.T.Howard wrote:
    >
    > <snip>
    >
    >> your ideas were essential to getting the peice of code i'm working on now
    >> running in a reasonable amount of time - thanks to all the contributors!
    >>
    >> -a

    >
    > No applause please. Just throw money (at RubyCentral).


    how? i bought the new book...

    -a
    --
    ===============================================================================
    | email :: ara [dot] t [dot] howard [at] noaa [dot] gov
    | phone :: 303.497.6469
    | My religion is very simple. My religion is kindness.
    | --Tenzin Gyatso
    ===============================================================================
    Ara.T.Howard, Aug 3, 2005
    #14
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. senthil
    Replies:
    3
    Views:
    1,866
    prabinthomaskottayil
    Nov 25, 2011
  2. Derek
    Replies:
    12
    Views:
    591
    Mykola Rabchevskiy
    Jul 22, 2003
  3. Jim

    Algorithm help

    Jim, Jul 16, 2003, in forum: Java
    Replies:
    5
    Views:
    423
    Jos A. Horsmeier
    Jul 19, 2003
  4. Ahmed Moustafa
    Replies:
    0
    Views:
    767
    Ahmed Moustafa
    Nov 15, 2003
  5. Bapaiah Katepalli
    Replies:
    1
    Views:
    1,483
    Mike Treseler
    Jun 23, 2006
Loading...

Share This Page