eof and nested while (<$fh>) {...}

Discussion in 'Perl Misc' started by Greg Bacon, Jun 24, 2004.

  1. Greg Bacon

    Greg Bacon Guest

    I was writing code to scan an assembly-language definition of
    operational data and produce a report and ended up writing code
    that gave me the "there has to be a better way" feeling.

    Single parameters are easy to spot, e.g.,

    label1 .word 1234ABCDh
    label2 .float 3.14159

    Most arrays are trivial too:

    label3 .word 1, 2, 3

    Array specifications can span multiple lines, however. For example:

    label4 .float 0.0, 0.5, 1.0
    .float 1.5, 2.0, 2.5

    At first, I used a regular expression to feed individual values into
    a sub that kept track of the last label grabbed and determined whether
    the current value was a new parameter or a continuation of an array.

    The code -- and the approach, really -- was unsatisfying, so I
    considered a two-pass scan: grab and decompose the chunks and then
    coalesce the arrays in a second pass. I made a start in that
    direction but didn't like the way it was playing out.
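
For illustration only, a two-pass scan along those lines might look roughly like the sketch below. It is not the code from the original attempt: the sub names `decompose` and `coalesce` and the hash layout are hypothetical, and it assumes an unlabeled directive line always continues the most recent labeled one.

```perl
use strict;
use warnings;

# Pass 1 (hypothetical helper): split each directive line into
# [label-or-undef, type, data]; other lines are skipped.
sub decompose {
    my (@lines) = @_;
    my @chunks;
    for my $line (@lines) {
        next unless $line =~ /^(\w+)?\s*\.(word|float)\s+(.+?),?\s*$/;
        push @chunks, [$1, $2, $3];   # $1 is undef on continuation lines
    }
    return @chunks;
}

# Pass 2 (hypothetical helper): fold unlabeled chunks into the
# preceding labeled parameter.
sub coalesce {
    my (@chunks) = @_;
    my @params;
    for my $chunk (@chunks) {
        my ($label, $type, $data) = @$chunk;
        if (defined $label) {
            push @params, { label => $label, type => $type, data => $data };
        }
        elsif (@params) {
            $params[-1]{data} .= ", $data";
        }
    }
    return @params;
}
```

Calling `coalesce(decompose(<$fh>))` would then yield one hash per parameter, at the cost of holding every chunk in memory between the passes.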

    I saw that scanning an entire array would be straightforward too.
    I could safely look ahead. Lines without labels continued the
    current array, and I could pretend the values were on one line by
    appending to the end of what I had already recognized.

    If the lookahead line had a label, I could process what I had and then
    C<redo> to process the lookahead line that's already in $_.

    Here's a sketch of the code:

    while (<$fh>) {
        next unless /^(\w+)\s+\.(word|float)\s+(.+?),?\s*$/;
        my($label,$type,$data) = ($1,$2,$3);

        # look for continued spec
        my $needredo = 0;
        while (<$fh>) {
            if (/^\s*\.(word|float)\s+(.+?),?\s*$/) {
                $data .= ", $2";
            }
            else {
                # lookahead line starts something new; leave it in $_
                $needredo = 1;
                last;
            }
        }

        # now $label, $type, and $data comprise an
        # entire parameter
        ...;

        redo if $needredo;
    }

    That's already kind of klunky, but I also saw that the inner while loop
    can exhaust the input, which the outer loop implicitly tests too. I
    tested for C<eof $fh> at each iteration of the inner loop and reset
    $needredo if I needed to C<last> out of the inner loop.
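
A minimal sketch of the eof-guarded variant described above, assuming the same two regular expressions as the sketch. The sub name `scan_params` and the returned list of array refs are hypothetical and stand in for whatever the elided processing step does.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [label, type, data] records.
sub scan_params {
    my ($fh) = @_;
    my @params;
    local $_;
    while (<$fh>) {
        next unless /^(\w+)\s+\.(word|float)\s+(.+?),?\s*$/;
        my ($label, $type, $data) = ($1, $2, $3);

        # Look ahead for continuation lines, but never read past eof,
        # so the outer loop is the only place that sees end of input.
        my $needredo = 0;
        until (eof $fh) {
            $_ = <$fh>;
            if (/^\s*\.(word|float)\s+(.+?),?\s*$/) {
                $data .= ", $2";
            }
            else {
                $needredo = 1;   # lookahead line is still in $_
                last;
            }
        }

        push @params, [$label, $type, $data];
        redo if $needredo;       # reprocess the lookahead line
    }
    return @params;
}
```

Because the inner loop stops on `eof $fh` instead of on a failed read, `$needredo` stays 0 when the file ends in the middle of an array, and `$_` is never left undefined for the `redo`.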

    The code now feels very klunky. Is there a more elegant way to code
    this scan?

    Greg
    --
    A democracy is nothing more than mob rule, where fifty-one percent of
    the people may take away the rights of the other forty-nine.
    -- Thomas Jefferson
     
    Greg Bacon, Jun 24, 2004
    #1

  2. Steven Kuo

    Steven Kuo Guest

    On Thu, 24 Jun 2004, Greg Bacon wrote:

    > I was writing code to scan an assembly-language definition of
    > operational data and produce a report and ended up writing code
    > that gave me the "there has to be a better way" feeling.
    >
    > Single parameters are easy to spot, e.g.,
    >
    > label1 .word 1234ABCDh
    > label2 .float 3.14159
    >
    > Most arrays are trivial too:
    >
    > label3 .word 1, 2, 3
    >
    > Array specifications can span multiple lines, however. For example:
    >
    > label4 .float 0.0, 0.5, 1.0
    > .float 1.5, 2.0, 2.5
    >


    (snipped)

    > Here's a sketch of the code:
    >
    > while (<$fh>) {
    >     next unless /^(\w+)\s+\.(word|float)\s+(.+?),?\s*$/;
    >     my($label,$type,$data) = ($1,$2,$3);
    >
    >     # look for continued spec
    >     my $needredo = 0;
    >     while (<$fh>) {
    >         if (/^\s*\.(word|float)\s+(.+?),?\s*$/) {
    >             $data .= ", $2";
    >         }
    >         else {
    >             # lookahead line starts something new; leave it in $_
    >             $needredo = 1;
    >             last;
    >         }
    >     }
    >
    >     # now $label, $type, and $data comprise an
    >     # entire parameter
    >     ...;
    >
    >     redo if $needredo;
    > }
    >
    > That's already kind of klunky, but I also saw that the inner while loop
    > can exhaust the input, which the outer loop implicitly tests too. I
    > tested for C<eof $fh> at each iteration of the inner loop and reset
    > $needredo if I needed to C<last> out of the inner loop.
    >
    > The code now feels very klunky. Is there a more elegant way to code
    > this scan?
    >





    I don't think nested loops are needed. How about:

    #!/usr/local/bin/perl
    use strict;
    use warnings;

    my ($label, $type, $data);

    while (<DATA>) {
        if (/^\s*\.(word|float)\s+(.+?),?\s*$/) {
            # continuation line: extend the current parameter, if any
            $data .= ", $2" if defined $label;
        } elsif (/^(\w+)\s+\.(word|float)\s+(.+?),?\s*$/) {
            # new label: emit the previously found parameter first
            do_stuff($label, $type, $data) if defined $label;
            ($label, $type, $data) = ($1, $2, $3);
        }
    }

    # emit the last parameter
    do_stuff($label, $type, $data) if defined $label;

    sub do_stuff {
        my ($label, $type, $data) = @_;
        print "$label, $type, $data\n";
    }

    __DATA__

    label1 .word 1234ABCDh
    label2 .float 3.14159

    label3 .word 1, 2, 3

    label4 .float 0.0, 0.5, 1.0
    .float 1.5, 2.0, 2.5


    --
    Hope this helps,
    Steven
     
    Steven Kuo, Jun 24, 2004
    #2

  3. Anno Siegel

    Anno Siegel Guest

    Greg Bacon wrote in comp.lang.perl.misc:
    > I was writing code to scan an assembly-language definition of
    > operational data and produce a report and ended up writing code
    > that gave me the "there has to be a better way" feeling.
    >
    > Single parameters are easy to spot, e.g.,
    >
    > label1 .word 1234ABCDh
    > label2 .float 3.14159
    >
    > Most arrays are trivial too:
    >
    > label3 .word 1, 2, 3
    >
    > Array specifications can span multiple lines, however. For example:
    >
    > label4 .float 0.0, 0.5, 1.0
    > .float 1.5, 2.0, 2.5


    [snip]

    Ah, ye olde continuation line problem. Subtype 2, where you know if a
    line *is* a continuation but not if a line *has* a continuation.

    Here is one way of doing that. A continuation line is one that starts
    with 10 blanks.

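    my $coll = '';
    while ( <DATA> ) {
        chomp;
        if ( substr( $_, 0, 10) =~ /\S/ ) {
            # non-blank in the first 10 columns: a new record starts
            print "$coll\n" if length $coll;
            $coll = $_;
        } else {
            # continuation line: append to the collected record
            $coll .= $_;
        }
    }
    print "$coll\n" if length $coll;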

    This only collects continued lines into one. It would be simple
    to add further processing to the loop so that it spits out ready-
    to-use records.
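
One possible way to add that further processing is sketched below. It keeps the 10-blank continuation rule from the loop above and assumes each record has the `label .word|.float values` shape from the original question. The names `parse_record` and `collect_records` and the hash layout are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one collected line into a record hash,
# or return nothing if the line is not a labeled directive.
sub parse_record {
    my ($line) = @_;
    my ($label, $type) = $line =~ /^(\w+)\s+\.(word|float)\b/
        or return;
    (my $rest = $line) =~ s/^\w+//;        # drop the label
    $rest =~ s/\.(?:word|float)\b/,/g;     # directives become separators
    my @values = grep { length } split /[\s,]+/, $rest;
    return { label => $label, type => $type, values => \@values };
}

# Same collecting loop as above, but pushing parsed records
# instead of printing the joined lines.
sub collect_records {
    my ($fh) = @_;
    my @records;
    my $coll = '';
    my $flush = sub {
        my $rec = parse_record($coll);
        push @records, $rec if $rec;
        $coll = '';
    };
    while (my $line = <$fh>) {
        chomp $line;
        if (substr($line, 0, 10) =~ /\S/) {
            $flush->();
            $coll = $line;
        } else {
            $coll .= $line;
        }
    }
    $flush->();
    return @records;
}
```

Each record then carries its values as a list, so `@{ $rec->{values} }` can be used directly without re-splitting the joined text.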

    Anno
     
    Anno Siegel, Jun 29, 2004
    #3