What is the best way to pull out a range of values?

Discussion in 'Perl Misc' started by Alf McLaughlin, Mar 16, 2006.

  1. Hello all-
    It seems I am running across the following problem a lot recently.
    Let's say I have a large collection of values and they are in an array
    (perhaps there are 10,000 of them). I would like to pull out all
    values between $x and $y without having to loop through the entire
    array to figure out where the values I'm interested in are locating.
    Is there a simple way to do this in Perl? For example, in SQL I would
    write something like this:

    SELECT position FROM my_table WHERE position BETWEEN 14000 and 1479;

    Perhaps the answer involves storing the data in a different structure
    (I can picture this working very quickly if the values I'm interested
    in were keys in a hash, but I still don't know how to do it).

    Thanks,
    Alf
     
    Alf McLaughlin, Mar 16, 2006
    #1
    1. Advertising

  2. Alf McLaughlin

    Paul Lalli Guest

    Alf McLaughlin wrote:
    > Hello all-
    > It seems I am running across the following problem a lot recently.
    > Let's say I have a large collection of values and they are in an array
    > (perhaps there are 10,000 of them). I would like to pull out all
    > values between $x and $y without having to loop through the entire
    > array to figure out where the values I'm interested in are locating.
    > Is there a simple way to do this in Perl?


    I see two possible solutions, other than the explicit loop you said you
    want to avoid:

    1) sort them all, and then find the ones you want:
    my @sorted = sort { $a <=> $b } @values;
    my @selected;
    my $i = 0;
    $i++ while $selected[$i] < $x;
    push $selected[$i] until $selected[$i++] > $y;


    2) Use grep, which really is the same as your original solution, just
    hiding the loop:
    my @selected = grep { $_ >= $x and $_ <= $y } @values;

    I'm not at all convinced the first method is in any way "better" than
    the linear search (you can do some Benchmarks if you're curious.).

    > Perhaps the answer involves storing the data in a different structure
    > (I can picture this working very quickly if the values I'm interested
    > in were keys in a hash, but I still don't know how to do it).


    I can't imagine how you would build such a hash without doing a linear
    search through the values to begin with, which negates any possible
    benefit the hash look up would have.

    Paul Lalli
     
    Paul Lalli, Mar 16, 2006
    #2
    1. Advertising

  3. Alf McLaughlin

    DJ Stunks Guest

    Alf McLaughlin wrote:
    > Hello all-
    > It seems I am running across the following problem a lot recently.
    > Let's say I have a large collection of values and they are in an array
    > (perhaps there are 10,000 of them). I would like to pull out all
    > values between $x and $y without having to loop through the entire
    > array to figure out where the values I'm interested in are locating.
    > Is there a simple way to do this in Perl? For example, in SQL I would
    > write something like this:
    >
    > SELECT position FROM my_table WHERE position BETWEEN 14000 and 1479;


    I'm sure a computer scientist out there will have a faster way, but the
    simplest and easiest comprehensible way is:

    my @selected_nums = grep { $x <= $_ and $_ < $y } @ten_thousand_nums

    sorting would be necessary if you performed some kind of
    chop-search-ish routine to identify the indexes associated with the
    start and end for @select_nums, then just slice them out of
    @ten_thousand_nums, but like I said - I'll leave that for the computer
    scientists... the time to sort the big array would also have to be
    taken into account then since grep could operate on unsorted data.

    > Perhaps the answer involves storing the data in a different structure
    > (I can picture this working very quickly if the values I'm interested
    > in were keys in a hash, but I still don't know how to do it).


    I don't see how this would be any faster in a hash, but I'm all ears...

    -jp
     
    DJ Stunks, Mar 16, 2006
    #3
  4. Alf McLaughlin

    Guest

    "Alf McLaughlin" <> wrote:
    > Hello all-
    > It seems I am running across the following problem a lot recently.
    > Let's say I have a large collection of values and they are in an array
    > (perhaps there are 10,000 of them). I would like to pull out all
    > values between $x and $y without having to loop through the entire
    > array to figure out where the values I'm interested in are locating.


    Why do you want to avoid looping through the entire array? Because it is
    too slow? Or because it takes too much code to implement?

    > Is there a simple way to do this in Perl? For example, in SQL I would
    > write something like this:
    >
    > SELECT position FROM my_table WHERE position BETWEEN 14000 and 1479;


    How do you know the DBMS isn't just looping through the whole table behind
    the scenes? Or is that the point, that you want it to happen behind
    the scenes?

    my @result = grep $_>=14000 && $_<=14790, @my_table;

    or maybe:

    sub between(\@$$) { grep $_>=$_[1] && $_<=$_[2], @{$_[0]} };

    >
    > Perhaps the answer involves storing the data in a different structure
    > (I can picture this working very quickly if the values I'm interested
    > in were keys in a hash, but I still don't know how to do it).


    I don't see how you could make it happen very fast (i.e. substantially
    faster than the ordinary array scan, which is already quite fast for an
    array as small as 10_000) just by simply using a regular hash.

    If the array is sorted, you could do a binary search for the first element
    in the range, then linear search from their to the last.

    You could store the numbers in range-bins a hash (or an array) and then
    limit your inspection to only the necessary bins.

    Both of these require special consideration at the time you put things
    into the list, as well as when you search it.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
     
    , Mar 16, 2006
    #4
  5. Alf McLaughlin

    Uri Guttman Guest

    >>>>> "DS" == DJ Stunks <> writes:

    DS> Alf McLaughlin wrote:
    >> Hello all-
    >> It seems I am running across the following problem a lot recently.
    >> Let's say I have a large collection of values and they are in an array
    >> (perhaps there are 10,000 of them). I would like to pull out all
    >> values between $x and $y without having to loop through the entire
    >> array to figure out where the values I'm interested in are locating.
    >> Is there a simple way to do this in Perl? For example, in SQL I would
    >> write something like this:
    >>
    >> SELECT position FROM my_table WHERE position BETWEEN 14000 and 1479;


    DS> I'm sure a computer scientist out there will have a faster way, but the
    DS> simplest and easiest comprehensible way is:

    DS> my @selected_nums = grep { $x <= $_ and $_ < $y } @ten_thousand_nums

    DS> sorting would be necessary if you performed some kind of
    DS> chop-search-ish routine to identify the indexes associated with the
    DS> start and end for @select_nums, then just slice them out of
    DS> @ten_thousand_nums, but like I said - I'll leave that for the computer
    DS> scientists... the time to sort the big array would also have to be
    DS> taken into account then since grep could operate on unsorted data.

    >> Perhaps the answer involves storing the data in a different structure
    >> (I can picture this working very quickly if the values I'm interested
    >> in were keys in a hash, but I still don't know how to do it).


    DS> I don't see how this would be any faster in a hash, but I'm all ears...

    a hash would have the same problem as an array, you can't do ordered
    sequential access from any given value. the proper way to do this is
    with a B tree or similar structure. then you have fairly quick access
    O(log N) to any element and very quick access O(1) to the next elements
    in sorted order. this is also famously known as ISAM (indexed sequential
    access method) and was a core feature in cobol and pl/1.

    i wouldn't be surprised if there was some btree stuff on cpan. on disk
    you can get it from most any decent db as it is a common type of index.

    uri

    --
    Uri Guttman ------ -------- http://www.stemsystems.com
    --Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
    Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
     
    Uri Guttman, Mar 16, 2006
    #5
  6. >How do you know the DBMS isn't just looping through the whole table behind
    >the scenes? Or is that the point, that you want it to happen behind
    >the scenes?


    I think maybe the DBMS is using a BTREE once it is properly indexed. I
    think this is going on because if I first create the index:

    CREATE INDEX position ON my_table(position);

    Then I can see the number of rows needed for the search go down
    considerably:

    EXPLAIN SELECT position FROM my_table WHERE position BETWEEN $value1
    and $value2;

    So, yes, an unindexed database search isn't any faster than looping
    through all the values.

    The following module seems to work:

    http://search.cpan.org/~rrwo/Tie-RangeHash-1.03/lib/Tie/RangeHash.pm

    I'm sure there are BTREE modules; that is a great idea.
     
    Alf McLaughlin, Mar 17, 2006
    #6
  7. Alf McLaughlin

    Guest

    "Alf McLaughlin" <> wrote:
    > >How do you know the DBMS isn't just looping through the whole table
    > >behind the scenes? Or is that the point, that you want it to happen
    > >behind the scenes?

    >
    > I think maybe the DBMS is using a BTREE once it is properly indexed. I
    > think this is going on because if I first create the index:

    ....
    >
    > So, yes, an unindexed database search isn't any faster than looping
    > through all the values.


    So from the above, I gather it is not the neat syntax of SQL but rather
    the performance offered by the index you are interested in?

    Just using DBI and SQL might be a good way to go.

    >
    > The following module seems to work:
    >
    > http://search.cpan.org/~rrwo/Tie-RangeHash-1.03/lib/Tie/RangeHash.pm


    I can't seem to make that module work for what you want it to do. Or at
    least not correctly.

    use strict;
    use warnings;
    use Tie::RangeHash;

    my $t=new Tie::RangeHash(Type => Tie::RangeHash::TYPE_NUMBER);
    my @array;

    for (1..10_000) {
    my $x = int rand(2e7);
    $t->add("$x,$x",$x);
    push @array,$x;
    };

    print join "\t", "fetch", (sort {$a<=>$b}
    $t->fetch_overlap("50000,60000"))[1..10];
    print join "\t", "array", sort {$a<=>$b}
    grep $_>=50_000 && $_<=60_000, @array;

    __END__

    fetch 3202 5936 10416 16533 17858 25605 26118 29266 31885 35022
    array 50701 50948 51314 51644 56512 56999 58544


    > I'm sure there are BTREE modules; that is a great idea.


    I've played around with Tree::BPTree. You can get cursors into the tree,
    but the cursors always start at the beginning or end. As far as I can
    tell, you can't ask for a cursor which starts at any given point, which
    seems to stultify the whole point of using BTrees. I started trying to add
    this feature, but I decided the whole thing was slow enough that I might
    as well just use hashes and/or arrays, or a real db with DBI (or a binning
    method with DBM::Deep for one recent job.)

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
     
    , Mar 17, 2006
    #7
  8. >I can't seem to make that module work for what you want it to do. Or at
    >least not correctly.


    I think I got Tie::RangeHash to do what I want. It seems to work; let
    me know if you notice anything wrong with this. I haven't clocked it
    yet, so this may still be inefficient.

    #!/usr/bin/perl

    BEGIN: {
    use Tie::RangeHash;
    }

    MAIN: {
    my $val1 = 1;
    my $val2 = 100;
    my $val3 = 1000;
    my $val4 = 10000;
    my $val5 = 40000;
    my $val6 = 50000;
    my $range1 = "$val1,$val2";
    my $range2 = "$val3,$val4";
    my $range3 = "$val5,$val6";
    tie %hash, 'Tie::RangeHash', {Type => Tie::RangeHash::TYPE_NUMBER};
    %hash = ($range1 => 'Element1',
    $range2 => 'Element2',
    $range3 => 'Element3');
    my $result = $hash{$ARGV[0]};
    unless($result) { $result = 'Sorry, No Results';}
    print "$result\n";
    }
     
    Alf McLaughlin, Mar 23, 2006
    #8
  9. Alf McLaughlin

    Guest

    "Alf McLaughlin" <> wrote:
    > >I can't seem to make that module work for what you want it to do. Or at
    > >least not correctly.

    >
    > I think I got Tie::RangeHash to do what I want. It seems to work; let
    > me know if you notice anything wrong with this. I haven't clocked it
    > yet, so this may still be inefficient.


    There are only two concerns I have about it (other than efficiency).
    One is that it doesn't do what you originally said you wanted. It takes
    a single-point query, and returns a single "label" for the range that that
    query falls in. Your original specification was that it take a range query
    (a two-point query) and return a list of all the points in that range. So
    originally you wanted a database of points probed with a range query, but
    this is a database of ranges probed with a point query. These are quite
    different things. So I don't know if your original description wasn't
    accurate and you new realize that this is what you really wanted to do
    (hey, it happens), or if it was originally accurate but you lost sight of
    that goal (that happens, too).

    My other concern is that, when I'm using a module which seems to give
    obviously wrong answers under some use-cases, I am always a bit worried
    about whether that same underlying problem could also cause less obvious
    wrong answers in other use-cases. So if I were going to use this module
    for any real work, I'd either trace done the cause of the fetch_overlap bug
    to make sure it doesn't also effect the FETCH method, or I'd do extensive
    tests of the FETCH method by generating large amounts of random data to
    query, and verify that it gives the right answer.

    Cheers,

    Xho


    >
    > #!/usr/bin/perl
    >
    > BEGIN: {
    > use Tie::RangeHash;
    > }
    >
    > MAIN: {
    > my $val1 = 1;
    > my $val2 = 100;
    > my $val3 = 1000;
    > my $val4 = 10000;
    > my $val5 = 40000;
    > my $val6 = 50000;
    > my $range1 = "$val1,$val2";
    > my $range2 = "$val3,$val4";
    > my $range3 = "$val5,$val6";
    > tie %hash, 'Tie::RangeHash', {Type => Tie::RangeHash::TYPE_NUMBER};
    > %hash = ($range1 => 'Element1',
    > $range2 => 'Element2',
    > $range3 => 'Element3');
    > my $result = $hash{$ARGV[0]};
    > unless($result) { $result = 'Sorry, No Results';}
    > print "$result\n";
    > }


    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
     
    , Mar 23, 2006
    #9
  10. >There are only two concerns I have about it (other than efficiency).
    >One is that it doesn't do what you originally said you wanted.


    Hi Xho-
    The label can also be a data structure that holds the values of the
    low and high ranges. The code below demonstrates this.

    >My other concern is that, when I'm using a module which seems to give
    >obviously wrong answers under some use-cases, I am always a bit worried


    Me too! :) In addition to evaluating speed, I will compare it to my
    subroutine that doesn't use the module to make sure both versions of
    the program are giving me the same results. Best, ALF


    The revised code:

    #!/usr/bin/perl

    BEGIN: {
    use Tie::RangeHash;
    }

    MAIN: {
    my ($val1, $val2, $val3, $val4, $val5, $val6) = (100, 1000, 10000,
    40000, 50000, 100000);
    my $range1 = "$val1,$val2";
    my $range2 = "$val3,$val4";
    my $range3 = "$val5,$val6";
    tie %hash, 'Tie::RangeHash', {Type => Tie::RangeHash::TYPE_NUMBER};
    %hash = ($range1 => [($val1, $val2)],
    $range2 => [($val3, $val4)],
    $range3 => [($val5, $val6)]);
    my ($low_range, $high_range) = ($hash{$ARGV[0]}->[0],
    $hash{$ARGV[0]}->[1]);
    my $result;
    unless($low_range && $high_range) {
    $result = 'Sorry, No Results';
    } else {
    $result = "low: $low_range, high: $high_range";
    }
    print "$result\n";
    }
     
    Alf McLaughlin, Mar 23, 2006
    #10
  11. The results: there are no speed benefits gained by switching to this
    module. :-(

    back to the drawing board!
     
    Alf McLaughlin, Mar 23, 2006
    #11
  12. Alf McLaughlin

    Guest

    "Alf McLaughlin" <> wrote:
    > >There are only two concerns I have about it (other than efficiency).
    > >One is that it doesn't do what you originally said you wanted.

    >
    > Hi Xho-
    > The label can also be a data structure that holds the values of the
    > low and high ranges. The code below demonstrates this.


    Yep, I understand that you can do this. I just don't understand how
    it helps. You are still going from single-point query to a single range
    (expressed as either a label or a pair) as the response, and not from a
    single range as the query to list of points as the response.

    Anyway,

    my ($low_range, $high_range) = ($hash{$ARGV[0]}->[0],
    $hash{$ARGV[0]}->[1]);

    This would be better written as:

    my ($low_range, $high_range) = @{$hash{$ARGV[0]}};

    In your code you are doing the hash FETCH twice, once to feed to ->[0] and
    once to feed to ->[1]. FETCH on a tied hash likely to be the slow step, so
    it is better to only do it once rather than twice.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
     
    , Mar 23, 2006
    #12
  13. Alf McLaughlin

    Guest

    "Alf McLaughlin" <> wrote:
    > The results: there are no speed benefits gained by switching to this
    > module. :-(
    >
    > back to the drawing board!


    I'm still not exactly sure what you're requirements are. Could you post
    the code which implements, tests, and times your current (slow) hand-rolled
    method and also tests and times your (also slow) Tie::RangeHash method?
    With that in hand I might be able to make a better recommendation. (If the
    testing/timing code is too long to post here, you can email me.)

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
     
    , Mar 23, 2006
    #13
  14. woops! you are correct... i forgot that it wasn't a normal hash. :)
     
    Alf McLaughlin, Mar 23, 2006
    #14
  15. i'll email it after i clean some things up.
     
    Alf McLaughlin, Mar 23, 2006
    #15
  16. Alf McLaughlin

    Anno Siegel Guest

    Alf McLaughlin <> wrote in comp.lang.perl.misc:
    > Hello all-
    > It seems I am running across the following problem a lot recently.
    > Let's say I have a large collection of values and they are in an array
    > (perhaps there are 10,000 of them). I would like to pull out all
    > values between $x and $y without having to loop through the entire
    > array to figure out where the values I'm interested in are locating.
    > Is there a simple way to do this in Perl? For example, in SQL I would
    > write something like this:
    >
    > SELECT position FROM my_table WHERE position BETWEEN 14000 and 1479;
    >
    > Perhaps the answer involves storing the data in a different structure


    [Ignoring the other thread that has sprouted, I'm going back to the OP]

    It seems to me the data structure you are looking for is a sorted list.
    Then you can use binary search to find the first element above the lower
    end of the interval, and the last element below the upper end. The elements
    in between are all the list elements in the interval.

    Of course, sorting only pays when you have many intervals to check
    against the same set of data. Otherwise, a linear search would be faster.

    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
     
    Anno Siegel, Mar 23, 2006
    #16
  17. > It seems to me the data structure you are looking for is a sorted list.
    > Then you can use binary search to find the first element above the lower
    > end of the interval, and the last element below the upper end. The elements
    > in between are all the list elements in the interval.
    >
    > Of course, sorting only pays when you have many intervals to check
    > against the same set of data. Otherwise, a linear search would be faster.
    >
    > Anno


    Thanks, Anno. I am realizing after posting with Xho that I did not
    state my question precisely right, but you did deduce what I am
    interested in I think. Basically, I have a list of 500 values and I
    would like to take another list of values (which happens to be 80 long
    at the moment, but this could be longer as well) and for each of these
    80 values I would like to find which 2 of the 500 values is the next
    greater value and the next lesser value. I've been doing this (in
    pseudocode, please take syntax mistakes and finer points lightly):

    my @val1 = (my 80 values); my @val2 = (my 500 values);

    foreach my $i (@val1) {
    for (my $j = 0; $j < @val2 - 1; $j++) {
    my $current = $val2[$j];
    my $next = $val2[$j + 1];
    if ($i > $current && $i < $next) {
    #then this value is between $current and next.
    }
    }
    }

    In one example I have, it was necessary to loop through this 8000
    times. I may be interested in running this program perhaps 1 million
    times on a different set of 80 values; you can start to see how many
    times I will have to loop through the 500 values and how much work the
    computer will have to do. With one example I have, it took me 16
    seconds to compute this 1000 times. Since I am interested in running
    this perhaps millions of times, coding the problem this way in Perl may
    end up taking my program an additional day or two to run.

    I hope I have described the problem more clearly now. Does it still
    sound as if a binary search on a sorted array is the way to go?

    Also, I plan on re-coding the same program is C or C++ to see if the
    code speeds up; I suspect the gain in performance may make the program
    run in a reasonable amount of time without having to use a more
    sophisticated algorithm.

    Thanks for the help!
     
    Alf McLaughlin, Mar 24, 2006
    #17
  18. > It seems to me the data structure you are looking for is a sorted list.
    > Then you can use binary search to find the first element above the lower
    > end of the interval, and the last element below the upper end. The elements
    > in between are all the list elements in the interval.
    >
    > Of course, sorting only pays when you have many intervals to check
    > against the same set of data. Otherwise, a linear search would be faster.
    >
    > Anno


    Thanks, Anno. I am realizing after posting with Xho that I did not
    state my question precisely right, but you did deduce what I am
    interested in I think. Basically, I have a list of 500 values and I
    would like to take another list of values (which happens to be 80 long
    at the moment, but this could be longer as well) and for each of these
    80 values I would like to find which 2 of the 500 values is the next
    greater value and the next lesser value. I've been doing this (in
    pseudocode, please take syntax mistakes and finer points lightly):

    my @val1 = (my 80 values); my @val2 = (my 500 values);

    foreach my $i (@val1) {
    for (my $j = 0; $j < @val2 - 1; $j++) {
    my $current = $val2[$j];
    my $next = $val2[$j + 1];
    if ($i > $current && $i < $next) {
    #then this value is between $current and next.
    }
    }
    }

    In one example I have, it was necessary to loop through this 8000
    times. I may be interested in running this program perhaps 1 million
    times on a different set of 80 values; you can start to see how many
    times I will have to loop through the 500 values and how much work the
    computer will have to do. With one example I have, it took me 16
    seconds to compute this 1000 times. Since I am interested in running
    this perhaps millions of times, coding the problem this way in Perl may
    end up taking my program an additional day or two to run.

    I hope I have described the problem more clearly now. Does it still
    sound as if a binary search on a sorted array is the way to go?

    Also, I plan on re-coding the same program is C or C++ to see if the
    code speeds up; I suspect the gain in performance may make the program
    run in a reasonable amount of time without having to use a more
    sophisticated algorithm.

    Thanks for the help!
     
    Alf McLaughlin, Mar 24, 2006
    #18
  19. Alf McLaughlin

    Anno Siegel Guest

    Alf McLaughlin <> wrote in comp.lang.perl.misc:
    > > It seems to me the data structure you are looking for is a sorted list.
    > > Then you can use binary search to find the first element above the lower
    > > end of the interval, and the last element below the upper end. The elements
    > > in between are all the list elements in the interval.
    > >
    > > Of course, sorting only pays when you have many intervals to check
    > > against the same set of data. Otherwise, a linear search would be faster.
    > >
    > > Anno

    >
    > Thanks, Anno. I am realizing after posting with Xho that I did not
    > state my question precisely right, but you did deduce what I am
    > interested in I think. Basically, I have a list of 500 values and I
    > would like to take another list of values (which happens to be 80 long
    > at the moment, but this could be longer as well) and for each of these
    > 80 values I would like to find which 2 of the 500 values is the next
    > greater value and the next lesser value.


    That's exactly what binary search does, in log n time for lists of
    length n.

    # return the index $i, such that $a->[$i] <= $x and $x < $a->[$i+1]
    # if no such index exists, return empty
    sub binsearch {
    my ( $x, $a) = @_;
    return if $x < $a->[ 0] or $x >= $a->[ -1];
    my ( $l, $h) = ( 0, $#$a + 1);
    while ( $h - $l > 1 ) {
    my $mid = int( ($l + $h)/2);
    $a->[ $mid] <= $x ? $l : $h = $mid;
    }
    return $l;
    }


    > I've been doing this (in
    > pseudocode, please take syntax mistakes and finer points lightly):
    >
    > my @val1 = (my 80 values); my @val2 = (my 500 values);
    >
    > foreach my $i (@val1) {
    > for (my $j = 0; $j < @val2 - 1; $j++) {
    > my $current = $val2[$j];
    > my $next = $val2[$j + 1];
    > if ($i > $current && $i < $next) {


    One of the comparisons should be via "<=" (">="). Otherwise it will
    miss values that appear in the list exactly.

    > #then this value is between $current and next.
    > }
    > }
    > }


    Apparently you are already supposing that the list @val2 is sorted. Your
    search time is linear in the size of @val2. Binary search will greatly
    speed it up.

    [rest snipped]

    Anno
    --
    If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers.
     
    Anno Siegel, Mar 24, 2006
    #19
  20. Sweet! Thanks a lot everyone who helped out. I implemented this
    algorithm into my program and it is 10 times faster now (probably good
    enough, but I will re-code to C if I require more speed). I consider
    the issue resolved (for me, anyway). Here is a mini program that
    demonstrates (more or less) what I implemented:

    #!/usr/bin/perl

    MAIN: {
    my @test_array = qw(60 100 1500 3000);
    unless ($ARGV[0]) { die "ERROR: enter query.\n"; }
    my $query = $ARGV[0];
    my $first_element = $test_array[0];
    my $array_length = @test_array;
    my $last_element = $test_array[$array_length - 1];
    if ($query < $first_element || $query > $last_element) {
    die "ERROR: $query is out of range.\n";
    }
    my $ra_val = bsearch($query, \@test_array);
    if (@$ra_val == 1) {
    #then we found the exact value:
    my $exact = $ra_val->[0];
    print "EXACT found: $exact\n";
    } else {
    my $below_val = $ra_val->[0];
    my $above_val = $ra_val->[1];
    print "RANGE: $below_val\t$above_val\n";
    }
    }

    sub bsearch {
    my ($x, $a) = @_; # search for x in array a
    my ($l, $u) = (0, @$a - 1); # lower, upper end of search interval
    my $i; # index of probe
    while ($l <= $u) {
    $i = int(($l + $u)/2);
    if ($a->[$i] < $x) {
    $l = $i+1;
    }
    elsif ($a->[$i] > $x) {
    $u = $i-1;
    }
    else {
    return [($a->[$i])]; # found
    }
    }
    return [($a->[$u], $a->[$l])];
    }
     
    Alf McLaughlin, Mar 24, 2006
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. =?Utf-8?B?U2FjaGk=?=

    Need to pull out referance details

    =?Utf-8?B?U2FjaGk=?=, Sep 2, 2005, in forum: ASP .Net
    Replies:
    1
    Views:
    341
    =?Utf-8?B?Q2xpbnQgSGlsbA==?=
    Sep 2, 2005
  2. Bimo Remus

    pull out first and last words

    Bimo Remus, Jun 28, 2003, in forum: C++
    Replies:
    6
    Views:
    2,117
    Samuele Armondi
    Jun 29, 2003
  3. Replies:
    2
    Views:
    2,224
    Mike Treseler
    Jun 28, 2006
  4. Kun
    Replies:
    1
    Views:
    353
    Arne Ludwig
    Mar 25, 2006
  5. Replies:
    46
    Views:
    996
    Antoon Pardon
    Jul 25, 2006
Loading...

Share This Page