Continue to Search file after matching a value

Discussion in 'Perl Misc' started by deadpickle, Feb 13, 2007.

  1. deadpickle

    deadpickle Guest

    This is what I want the program to do:
    1. Read in a file containing:
    KBQP 071845Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=
    KBQP 071905Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=
    KHLR 071856Z AUTO 19010KT 10SM CLR 22/13 A3007 RMK AO2 SLP179
    KBQP 071925Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=
    2. Search for lines beginning with K
    EXAMPLE: next unless $obs =~ m/^(K)/;
    3. When they are found load them into an array
    EXAMPLE: my @a = split(" ", $obs);
    4. Next, continue to search the file. But instead of searching for any
    string beginning with K, it searches for any string beginning
    with the first four letters of the station ID.
    EXAMPLE: First time through finds that $a[0] = KBQP
    Continues through the file until it finds
    another KBQP
    5. After it has found another station ID with the same name as shown
    in #4, it then checks the next value in BOTH arrays and compares them.
    The object is to see which string is the newest.
    6. The newer string gets written to the array and the program continues
    to search for the same ID.
    7. If no other similar IDs are found, go back to step 1.

    The flow should look something like this:
    Search for "K"
    Found "KBQP" -> Load into array
    Search for "KBQP"
    Found "KBQP" -> comparing times of the observations
    Second "KBQP" found is newer -> replacing the array with newer "KBQP"
    Search for "KBQP"
    If No newer "KBQP" found -> Search For "K"

    I hope this is clear. My problem is that I have no clue how to do this
    after step 3. Any help would be appreciated.

    Code So far:
    ==============================================================
    use strict;
    use warnings;
    $\ = "\n";
    my $wmo = "07020719f.wmo";
    open OUT,'>', 'sub.txt' or die "cannot open 'sub.txt' $!";
    open WMO, '<', $wmo;
    while (my $obs = <WMO>) {
        next unless $obs =~ m/^(K)/;
        my @a = split(" ", $obs);


    }
    close OUT;
     
    deadpickle, Feb 13, 2007
    #1

  2. deadpickle

    Guest

    "deadpickle" <> wrote:
    > This is what I want the program to do:
    > 1. Read in a file containing:
    > KBQP 071845Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=
    > KBQP 071905Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=
    > KHLR 071856Z AUTO 19010KT 10SM CLR 22/13 A3007 RMK AO2 SLP179
    > KBQP 071925Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=
    > 2. Search for lines beginning with K


    In your example, that is all of them.

    > EXAMPLE: next unless $obs =~ m/^(K)/;
    > 3. When they are found load them into an array
    > EXAMPLE: my @a = split(" ", $obs);
    > 4. Next, continue to search the file. But instead of searching for any
    > string beginning with K, it searches for any string beginning
    > with the first four letters of the station ID.


    So then "last" out of the loop and start a different loop.

    > EXAMPLE: First time through finds that $a[0] = KBQP
    > Continues through the file until it finds
    > another KBQP
    > 5. After it has found another station ID with the same name as shown
    > in #4, it then checks the next value in BOTH arrays and compares them.
    > The object is to see which string is the newest.
    > 6. The newer string gets written to the array and the program continues
    > to search for the same ID.
    > 7. If no other similar IDs are found, go back to step 1.


    "similar" ne "same". Which do you want?

    At this point, you've reached the end of the file, so how does one go back
    to step 1? There is nothing left to process. Do you have to rewind in the
    file to some previously remembered landmark? If so, see "tell" and "seek".
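
    For reference, a minimal sketch of remembering a landmark with tell and
    returning to it with seek. The sample data and the variable names are
    illustrative assumptions, and the data is read from an in-memory
    filehandle only so the sketch runs on its own:

```perl
use strict;
use warnings;

# Sample data held in a string so the sketch is self-contained; an
# in-memory filehandle supports tell and seek like a regular file.
my $data = "KBQP 071845Z\nKHLR 071856Z\nKBQP 071905Z\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my $first = <$fh>;      # read the first observation
my $mark  = tell $fh;   # remember the byte offset just after it
my @rest  = <$fh>;      # scan on to the end of the file

# Rewind to the remembered landmark; 0 means an absolute offset (SEEK_SET).
seek $fh, $mark, 0 or die "cannot seek: $!";
my $again = <$fh>;      # the second line is read a second time
close $fh;
```

    Every rewind re-reads part of the file, which is why a single pass with
    a hash is usually the simpler design.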

    But really, if you would need to rewind, I think you are going about this
    fundamentally the wrong way. Use a hash on the station name, and store
    in it the "newest" string encountered so far. At the end, print out all
    these station/string pairs.

    > Code So far:
    > ==============================================================
    > use strict;
    > use warnings;
    > $\ = "\n";
    > my $wmo = "07020719f.wmo";
    > open OUT,'>', 'sub.txt' or die "cannot open 'sub.txt' $!";
    > open WMO, '<', $wmo;


    my %station;

    > while (my $obs = <WMO>) {
    > next unless $obs =~ m/^(K)/;
    > my @a = split(" ", $obs);


    my ($station, $other) = split / /, $obs, 2;
    if (not exists $station{$station}
        or newer_than($other, $station{$station})) {
        $station{$station} = $other;
    }

    >
    > }
    > close OUT;


    while (my ($k, $v) = each %station) {
        print "$k\t$v\n"; # or whatever format you want
    }
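
    The newer_than() call above is not defined anywhere in the thread. A
    minimal sketch of what it might look like, assuming both arguments start
    with the DDHHMMZ time group and that all observations fall within the
    same month, so the fixed-width groups can be compared as plain strings:

```perl
use strict;
use warnings;

# Hypothetical helper (not defined in the thread): true when the first
# observation's time group sorts after the second's.
# Assumption: each argument starts with the time group, e.g.
# "071905Z AUTO ...", and both observations are from the same month.
sub newer_than {
    my ($candidate, $current) = @_;
    my ($t_new) = split ' ', $candidate;   # first whitespace-separated field
    my ($t_old) = split ' ', $current;
    return $t_new gt $t_old;               # string comparison, not numeric
}

print newer_than("071905Z AUTO", "071845Z AUTO") ? "newer\n" : "not newer\n";   # prints "newer"
```

    A month rollover (day 31 followed by day 01) would defeat this
    comparison; handling that needs date information the sample lines do
    not carry.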

    Xho

     
    , Feb 13, 2007
    #2

  3. deadpickle

    Guest

    On Feb 13, 2:35 pm, "deadpickle" <> wrote:
    > 4. Next, continue to search the file. But instead of searching for any
    > string beginning with K, it searches for any string beginning
    > with the first four letters of the station ID.

    ....
    > 7. If no other similar IDs are found, go back to step 1.


    Gah! That's a convoluted mess! Run away, run away!

    Just use a hash to keep track of the highest found value for each
    identifier as you iterate over the file (once), such as:

    #!/usr/bin/perl
    use strict; use warnings;

    my %info;
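    while (<DATA>) {
        my $letters = (split)[0];
        # Keep the line that sorts highest for each station ID; the defined
        # check avoids an "uninitialized value" warning on first sight.
        $info{$letters} = $_
            if !defined $info{$letters} or $info{$letters} lt $_;
    }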
    print sort values %info;

    __DATA__
    KBQP 071845Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=
    KBQP 071905Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=
    KHLR 071856Z AUTO 19010KT 10SM CLR 22/13 A3007 RMK AO2 SLP179
    KBQP 071925Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=


    --
    The best way to get a good answer is to ask a good question.
    David Filmer (http://DavidFilmer.com)
     
    , Feb 13, 2007
    #3
  4. deadpickle <> wrote:

    > This is what I want the program to do:



    Please see the Posting Guidelines that are posted here frequently.


    > 1. Read in a file containing:
    > KBQP 071845Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=
    > KBQP 071905Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=
    > KHLR 071856Z AUTO 19010KT 10SM CLR 22/13 A3007 RMK AO2 SLP179
    > KBQP 071925Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=



    You should use the __DATA__ token for providing file data.

    If your aim is to find the "newest" one, then your sample data
    should probably NOT be sorted oldest-to-newest...


    > 2. Search for lines beginning with K
    > EXAMPLE: next unless $obs =~ m/^(K)/;
    > 3. When they are found load them into an array
    > EXAMPLE: my @a = split(" ", $obs);



    Is it OK if you can accomplish what you want without loading
    it into an array?


    > 4. Next, continue to search the file. But instead of searching for any
    > string beginning with K, it searches for any string beginning
    > with the first four letters of the station ID.
    > EXAMPLE: First time through finds that $a[0] = KBQP
    > Continues through the file until it finds
    > another KBQP
    > 5. After it has found another station ID with the same name as shown
    > in #4, it then checks the next value in BOTH arrays and compares them.
    > The object is to see which string is the newest.



    You have not mentioned how to interpret what is "newest", so I'll
    just go with the greatest string-wise.


    > 6. The newer string gets written to the array and the program continues
    > to search for the same ID.
    > 7. If no other similar IDs are found, go back to step 1.



    Blech!

    If there are 50 stations your algorithm will read the same
    file 50 times?

    That is just toooo wasteful.


    > The flow should look something like this:



    Why do you care what the flow looks like?

    Shouldn't you instead care about whether or not it makes the
    correct output, even if it uses a different flow?


    > I hope this is clear.



    You want the line with the greatest (newest) time for each
    station that appears in the data.

    Right?


    > My problem is that I have no clue how to do this
    > after step 3. Any help would be appreciated.
    >
    > Code So far:
    >==============================================================
    > use strict;
    > use warnings;



    Good. Very good.

    Thank you.


    > $\ = "\n";


    > open OUT,'>', 'sub.txt' or die "cannot open 'sub.txt' $!";



    Your code never makes any output, so those two lines are not
    necessary to illustrate your problem.

    If you choose an appropriate data structure, the algorithm gets
    quite simple:

    ---------------------------------
    #!/usr/bin/perl
    use warnings;
    use strict;

    my %stations;
    while ( <DATA> ) {
        # skip lines that do not start with "K" followed by a time field
        next unless /^(K[A-Z]+)\s+(\S+)/;

        if ( not exists $stations{$1}
             or $2 gt $stations{$1}{time} ) {
            $stations{$1}{time} = $2;
            $stations{$1}{line} = $_;
        }
    }

    foreach my $station ( keys %stations ) {
        print $stations{$station}{line};
    }


    __DATA__
    KBQP 071845Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=
    KBQP 071905Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=
    KHLR 071900Z AUTO 19010KT 10SM CLR 22/13 A3007 RMK AO2 SLP179
    KBQP 071925Z AUTO 35003KT 7SM OVC012 14/12 A3018 RMK AO2=
    KHLR 071856Z AUTO 19010KT 10SM CLR 22/13 A3007 RMK AO2 SLP179
    ---------------------------------


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Feb 14, 2007
    #4
  5. deadpickle <> wrote:

    > open OUT,'>', 'sub.txt' or die "cannot open 'sub.txt' $!";
    > open WMO, '<', $wmo;



    You should always, yes *always*, check the return value from open().

    You are already checking the 1st one, why stop when you got to the 2nd one?
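
    A minimal sketch of a checked open, using a lexical filehandle. The
    read_observations() helper and the file name 'no-such-file.wmo' are
    made up for illustration and are not part of the original program:

```perl
use strict;
use warnings;

# Hypothetical helper: read every line of an observation file, dying
# with the file name and the OS error if the file cannot be opened.
sub read_observations {
    my ($path) = @_;
    open my $in, '<', $path
        or die "cannot open '$path': $!";
    my @lines = <$in>;
    close $in;
    return @lines;
}

# A failed open is reported instead of silently yielding no data.
my @obs = eval { read_observations('no-such-file.wmo') };
print "open failed: $@" if $@;
```

    Without the check, the while loop would simply never run, and an empty
    result would look the same as a file with no matching lines.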


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Feb 14, 2007
    #5
