Should be a simple parsing problem

Discussion in 'Perl Misc' started by Tim, Apr 7, 2007.

  1. Tim

    Tim Guest

    Hello all, I am trying to get a simple Perl script working to
    transform some data, but have an incredibly onerous solution that
    looks more like visual basic than perl using lots of conditional
    statements, while loops and the shift function. Something inside
    doesn't feel right about that, plus I think I am missing out on
    expanding my limited Perl knowledge.

    My data is in the general form:
    _________________________
    TEXT {
    title: test
    font: script
    }

    POLYGON {
    name: foo
    type: good
    POINTS 3 {
    4,3
    2,16
    633,2
    }
    }

    JUNK {
    title: nothing
    }

    POLYGON {
    name: foo2
    type: bad
    POINTS 2 {
    7,9
    3,2
    }

    }

    Now I want to extract the points where the polygon type is 'good' so
    my output would be:
    4,3
    2,16
    633,2

    I can get Text::Balanced to work on a single line, but don't know how
    to elegantly parse down to the fields I need. Any thoughts would be
    greatly appreciated. Not committed to Text::Balanced, but it seems
    like it should work.

    best,

    Tim
     
    Tim, Apr 7, 2007
    #1

  2. On Sat, 07 Apr 2007 04:21:22 -0700, Tim wrote:

    > Hello all, I am trying to get a simple Perl script working to transform
    > some data, but have an incredibly onerous solution that looks more like
    > visual basic than perl using lots of conditional statements, while loops
    > and the shift function. Something inside doesn't feel right about that,
    > plus I think I am missing out on expanding my limited Perl knowledge.
    >
    > My data is in the general form:
    > _________________________
    > TEXT {
    > title: test
    > font: script
    > }
    >
    > POLYGON {
    > name: foo
    > type: good
    > POINTS 3 {
    > 4,3
    > 2,16
    > 633,2
    > }
    > }
    >
    > JUNK {
    > title: nothing
    > }
    >
    > POLYGON {
    > name: foo2
    > type: bad
    > POINTS 2 {
    > 7,9
    > 3,2
    > }
    >
    > }
    >
    > Now I want to extract the points where the polygon type is 'good' so my
    > output would be:
    > 4,3
    > 2,16
    > 633,2
    >
    > I can get Text::Balanced to work on a single line, but don't know how to
    > elegantly parse down to the fields I need. Any thoughts would be greatly
    > appreciated. Not committed to Text::Balanced, but it seems like it
    > should work.
    >


    OTTOMH:

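    while (<>) {
        next unless /^\s*POLYGON\s*\{/;           # find the start of a POLYGON block
        my $good = 0;
        while (<>) {
            $good = 1 if /^\s*type:\s*good\s*$/;  # remember the polygon type
            if (/^\s*POINTS\s+\d+\s*\{/) {
                while (<>) {                      # handle the points the same way
                    last if /^\s*\}\s*$/;         # end of the POINTS block
                    print if $good;
                }
                next;
            }
            last if /^\s*\}\s*$/;                 # end of the POLYGON block
        }
    }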

    This obviously assumes your input is always formatted in the same way, is
    always correct, and that type: comes before the POINTS block.
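If the ordering assumption is a concern, one option is to buffer the points and decide only when the POLYGON block closes. The following is a minimal sketch of that idea, not part of the original reply: the helper name good_points, the brace-depth counter, and the sample data (with type: placed after POINTS) are all illustrative assumptions. It still assumes one brace per line, as in the posted data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the point lines of every POLYGON whose
# type is 'good', whether type: comes before or after POINTS.
sub good_points {
    my ($fh) = @_;
    my (@out, @points);
    my $type       = '';
    my $depth      = 0;    # current brace nesting level
    my $in_polygon = 0;    # inside a top-level POLYGON block
    my $in_points  = 0;    # inside that polygon's POINTS block

    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^\s*(\w+)\b.*\{\s*$/) {          # a block opens
            $depth++;
            if ($depth == 1 && $1 eq 'POLYGON') {
                $in_polygon = 1;
                @points     = ();
                $type       = '';
            }
            $in_points = 1 if $in_polygon && $depth == 2 && $1 eq 'POINTS';
        }
        elsif ($line =~ /^\s*\}\s*$/) {                # a block closes
            $in_points = 0 if $depth == 2;
            if ($depth == 1 && $in_polygon) {
                push @out, @points if $type eq 'good'; # decide at the end
                $in_polygon = 0;
            }
            $depth-- if $depth > 0;
        }
        elsif ($in_points) {
            push @points, $line if $line =~ /\S/;      # buffer a point line
        }
        elsif ($in_polygon && $line =~ /^\s*type:\s*(\S+)/) {
            $type = $1;
        }
    }
    return @out;
}

# Illustrative data: type: deliberately follows POINTS in the first polygon.
my $data = <<'END';
POLYGON {
name: foo
POINTS 3 {
4,3
2,16
633,2
}
type: good
}

POLYGON {
name: foo2
type: bad
POINTS 2 {
7,9
3,2
}
}
END

open my $fh, '<', \$data or die "cannot open in-memory file: $!";
print "$_\n" for good_points($fh);
```

The trade-off is more bookkeeping in exchange for not depending on field order.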

    HTH
    M4
     
    Martijn Lievaart, Apr 7, 2007
    #2

  3. Tim <> wrote:

    > Now I want to extract the points where the polygon type is 'good' so
    > my output would be:
    > 4,3
    > 2,16
    > 633,2



    --------------------------
    #!/usr/bin/perl
    use warnings;
    use strict;
    use Text::Balanced 'extract_bracketed';

    local $/ = ''; # enable paragraph mode, see perlvar.pod

    while ( <DATA> ) {
    next unless /type: good.*POINTS\s+\d+/gs;
    my $bracketed = extract_bracketed();
    print "$bracketed\n";
    }


    __DATA__
    TEXT {
    title: test
    font: script
    }

    POLYGON {
    name: foo
    type: good
    POINTS 3 {
    4,3
    2,16
    633,2
    }
    }

    JUNK {
    title: nothing
    }

    POLYGON {
    name: foo2
    type: bad
    POINTS 2 {
    7,9
    3,2
    }

    }
    --------------------------
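The script above prints the extracted block including its braces. If only the coordinate lines are wanted, one alternative is a capture group in place of Text::Balanced. This is a swapped-in technique, workable only while POINTS blocks never nest. The sketch below is a hedged illustration: the helper name points_of_good and the in-memory sample data are assumptions, and it relies on the same paragraph-mode reading, so a blank line between type: and POINTS would defeat it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given one paragraph-mode record, return the
# coordinate lines of its POINTS block if the record is a 'good' polygon,
# or an empty list otherwise.
sub points_of_good {
    my ($record) = @_;
    return unless $record =~ /type:\s*good\b.*?POINTS\s+\d+\s*\{\s*(.*?)\s*\}/s;
    return split /\s*\n\s*/, $1;    # one element per coordinate line
}

# Illustrative data, read from an in-memory file instead of __DATA__.
my $data = <<'END';
TEXT {
title: test
}

POLYGON {
name: foo
type: good
POINTS 3 {
4,3
2,16
633,2
}
}

POLYGON {
name: foo2
type: bad
POINTS 2 {
7,9
3,2
}
}
END

open my $fh, '<', \$data or die "cannot open in-memory file: $!";
{
    local $/ = '';                  # paragraph mode, as in the script above
    while (my $record = <$fh>) {
        print "$_\n" for points_of_good($record);
    }
}
```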


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Apr 7, 2007
    #3
  4. Tim

    Tim Guest

    On Apr 7, 9:21 am, Martijn Lievaart <> wrote:
    > OTTOMH:
    >
    > while (<>) {
    > /^POLYGON\s+{/ and do {
    > while (<>) {
    > /\stype:\sgood\s*$/ {
    > //handle point here in the same way
    > /^}\s*$/ and last;
    > }
    > /^}\s*$/ and last;
    > }
    > }
    >
    > }
    >
    > This obviously assumes your input is always formatted in the same way, is
    > always correct and type: comes before the POINT.
    >
    > HTH
    > M4


    This is great, and very instructive. Thank you so much. What I
    wasn't understanding is how you can nest while(<>) statements. I have
    to think a bit more about what is really going on there, but this is
    what I was looking for: code that works more in line with how I think
    instead of going line by line and doing careful book-keeping. I also
    need to look up the 'and last' statement and see what that is doing.
    Thanks again.

    Tim
     
    Tim, Apr 8, 2007
    #4
  5. On Sat, 07 Apr 2007 19:27:38 -0700, Tim wrote:

    > This is great, and very instructive. Thank you so much, so what I wasn't
    > understanding is how you can nest while(<>) statements. I have to think
    > a bit more about what is really going on there, but this is what I was
    > looking for: code that works more in line with how I think instead of
    > going line-by line and doing careful book-keeping. I also need to look
    > up the 'and last' statement and see what that is doing. Thanks again.


    I like Tad's solution much better, but fwiw:

    - Yes, you can nest while (<>) like this. Normally it is more trouble than
    it's worth, but in this case it is appropriate. Just remember that you
    are reading the same file through the same file pointer, so whenever you
    reach another while (<>) (or return to the outer one), it carries on
    reading where the last one left off.

    - The "<condition> and last;" construct is another way of saying "if
    (<condition>) { last; }", only shorter. Often seen like this:

    # parse config file
    while (<$fh>) {
    /^\s*$/ and next; # skip empty lines
    /^\s*#/ and next; # skip comments
    ....
    }
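Both points can be seen in one small, self-contained sketch. The BEGIN/END markers, the sample input, and the "outer:"/"inner:" labels are made up for illustration; the labels only record which loop read each line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical demo input, read through an in-memory filehandle.
my $input = "a\nBEGIN\nb\nc\nEND\nd\n";
open my $fh, '<', \$input or die "cannot open in-memory file: $!";

my @log;
while (my $line = <$fh>) {
    chomp $line;
    push @log, "outer:$line";
    $line eq 'BEGIN' or next;        # same as: next unless $line eq 'BEGIN';
    while (my $inner = <$fh>) {      # continues where the outer loop stopped
        chomp $inner;
        $inner eq 'END' and last;    # same as: if ($inner eq 'END') { last }
        push @log, "inner:$inner";
    }
}
print "$_\n" for @log;
# prints:
# outer:a
# outer:BEGIN
# inner:b
# inner:c
# outer:d
```

The outer loop never sees b, c or END, because the inner loop already consumed them from the shared filehandle.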

    HTH,
    M4
     
    Martijn Lievaart, Apr 8, 2007
    #5
