Matching block of text awk-like: /---/,/---/

Discussion in 'Perl Misc' started by A. Farber, Jun 2, 2009.

  1. A. Farber

    A. Farber Guest

    Hello,

    I'd like to extract a few lines of input,
    delimited by the "-----" lines, i.e.

    ---------------
    blah
    bleh
    blue
    ---------------

    Is there a nice way in Perl to do that,
    maybe awk-like /^---/,/^---/ ?

    I have forgotten how to do it nicely in Perl,
    and it's difficult to find this case in docs/Google

    Thanks
    Alex
    A. Farber, Jun 2, 2009
    #1

  2. A. Farber

    A. Farber Guest

    Here is how I've done it so far, but I still wonder
    if /^---/,/^---/ or the like is supported in Perl:


    #!C:\Perl\bin\perl.exe -w

    use strict;
    use Data::Dumper;

    my $CMD = 'javaloader dir';
    my %cods;
    my $seen;

    open my $pipe, "$CMD |" or die "Can't run $CMD: $!";
    while (<$pipe>) {
        $seen = !$seen if /^-----/;

        $cods{lc $1} = lc $2 if $seen && /^(\w+)\s+(.+)$/;
    }
    close $pipe or die "Can't close $CMD: $!";

    print Dumper(\%cods);
    A. Farber, Jun 2, 2009
    #2

  3. A. Farber

    Uri Guttman Guest

    >>>>> "AF" == A Farber <> writes:

    AF> Hello,
    AF> I'd like to extract a few lines of input,
    AF> delimited by the "-----" lines, i.e.

    AF> ---------------
    AF> blah
    AF> bleh
    AF> blue
    AF> ---------------

    AF> Is there a nice way in Perl to do that,
    AF> maybe awk-like /^---/,/^---/ ?

    AF> I have forgotten how to do it nicely in Perl,
    AF> and it's difficult to find this case in docs/Google

    use the range operator .. in a scalar context. this is also known as the
    flip flop (or bistable) operator. it generally behaves like awk's range
    of patterns but is more general purpose and isn't tied to just a pair of
    patterns and lines like awk.

    but depending on the markers and the file size, these days i prefer to
    slurp in the file, match the delimiters and grab the content in a single
    regex (awk could never do this easily). it is generally faster than line
    by line match/grab with range and it is much simpler code. here is rough
    untested code:

    use File::Slurp ;

    my $text = read_file( $file_name ) ;

    # you need /m to allow ^ and $ to match line stuff inside the string
    # you need /s to allow . to match \n in the grabbed chunk assuming it is
    # multiple lines of text as you show

    # the \n after the opening delimiter keeps that newline out of $1
    while( $text =~ /^-+\n(.+?)^-+$/msg ) {
        process_stuff( $1 ) ;
    }

    as i said, that is simpler, faster and easier code than range can
    do. and awk can't go near it.
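
    A minimal, self-contained sketch of the slurp/extract idea, using only
    core Perl instead of File::Slurp. The sample text and the @chunks array
    are assumptions for illustration; the commented-out lines show one way a
    real file could be slurped, with $file_name standing in for a path.

```perl
use strict;
use warnings;

# Hypothetical sample input; in practice the text would come from a file.
my $text = join '', map { "$_\n" }
    'header', '-----', 'blah', 'bleh', 'blue', '-----', 'footer';

# One way to slurp a real file with core Perl only (path is an assumption):
#   open my $fh, '<', $file_name or die "Can't open $file_name: $!";
#   my $text = do { local $/; <$fh> };

my @chunks;
# /m: ^ and $ match at line boundaries inside the string
# /s: . also matches newlines, so one chunk can span several lines
# /g: keep scanning so that every delimited block is found
while ( $text =~ /^-+\n(.+?)^-+$/msg ) {
    push @chunks, $1;
}

print "chunk:\n$_" for @chunks;
# prints:
# chunk:
# blah
# bleh
# blue
```

    Note that each match consumes both delimiter lines, so the delimiters
    are treated as opening/closing pairs, just as the line-by-line
    approaches in this thread treat them.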

    uri

    --
    Uri Guttman ------ -------- http://www.sysarch.com --
    ----- Perl Code Review , Architecture, Development, Training, Support ------
    --------- Free Perl Training --- http://perlhunter.com/college.html ---------
    --------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
    Uri Guttman, Jun 2, 2009
    #3
  4. A. Farber

    Uri Guttman Guest

    >>>>> "AF" == A Farber <> writes:

    AF> Here is how I've done it so far, but I still wonder
    AF> if /^---/,/^---/ or the like is supported in Perl:


    AF> while (<$pipe>) {
    AF> $seen = !$seen if /^-----/;

    AF> $cods{lc $1} = lc $2 if $seen && /^(\w+)\s+(.+)$/;
    AF> }
    AF> close $pipe or die "Can't close $CMD: $!";

    you just reinvented the scalar range operator. see my other post.

    uri

    Uri Guttman, Jun 2, 2009
    #4
  5. "A. Farber" <> writes:

    > Here is how I've done it so far, but I still wonder
    > if /^---/,/^---/ or the like is supported in Perl:


    You want to look at the range operators, probably in the three dots
    version. Look in 'perldoc perlop' for the 'Range Operators' section.
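
    As a rough sketch, the loop from the earlier script could be written
    with the three-dot range operator roughly like this. The sample lines
    are hypothetical stand-ins, since the real output of the command is not
    shown in this thread.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the command's output.
my @lines = (
    'some banner text',
    '-----',
    'Alpha  One',
    'Beta   Two',
    '-----',
    'Gamma  Three',
);

my %cods;
for (@lines) {
    # True from the first delimiter line through the next one, inclusive.
    # With three dots, the right-hand pattern is not tested on the same
    # line that turned the range on.
    next unless /^-----/ ... /^-----/;

    # Delimiter lines fail this match (\w does not match '-'), so only
    # the lines in between are stored.
    $cods{lc $1} = lc $2 if /^(\w+)\s+(.+)$/;
}

print "$_ => $cods{$_}\n" for sort keys %cods;
# prints:
# alpha => one
# beta => two
```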

    //Makholm
    Peter Makholm, Jun 2, 2009
    #5
  6. A. Farber

    Guest

    On Jun 2, 8:41 am, "A. Farber" <> wrote:
    >
    > I'd like to extract a few lines of input,
    > delimited by the "-----" lines, i.e.
    >
    > ---------------
    > blah
    > bleh
    > blue
    > ---------------
    >
    > Is there a nice way in Perl to do that,
    > maybe awk-like /^---/,/^---/ ?



    Use the "..." range operator, like this:

    perl -lne "print if /^---/ ... /^---/"

    This will print out the "---" lines, though. If you don't want to
    print those, you can exclude them with this:

    perl -lne "print if /^---/ ... /^---/ and not /^---/"

    but make sure that the "not /^---/" part comes AFTER the
    "/^---/ ... /^---/" part, or else short-circuit evaluation will keep the
    range operator from being evaluated on the "---" lines, so the range
    never turns on and the one-liner prints nothing.
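
    The effect of the ordering can be sketched in a small script. The
    sample lines and the array names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input.
my @lines = ('before', '---', 'blah', 'bleh', '---', 'after');

my (@range_first, @not_first);
for (@lines) {
    # Range operator first: it is evaluated on every line, so it can
    # turn on and off at the delimiters.
    push @range_first, $_ if /^---/ ... /^---/ and not /^---/;
}
for (@lines) {
    # Test first: on delimiter lines "not /^---/" is false, "and"
    # short-circuits, and the range operator never sees those lines.
    push @not_first, $_ if not /^---/ and /^---/ ... /^---/;
}

print "range first: @range_first\n";          # range first: blah bleh
print "not first: ", scalar(@not_first), "\n"; # not first: 0
```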

    I hope this helps, A. Farber.

    -- Jean-Luc
    , Jun 2, 2009
    #6
  7. writes:

    > This will print out the "---" lines, though. If you don't want to
    > print those, you can exclude them with this:
    >
    > perl -lne "print if /^---/ ... /^---/ and not /^---/"
    >
    > but make sure that the "not /^---/" part comes AFTER the "/^---/ ... /
    > ^---/" part,


    And note that you have to use the low-precedence 'and' operator instead
    of the higher-precedence && operator. Otherwise the test will be parsed as
    part of the right operand of the range operator, which will then never be
    true.
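
    A small sketch of the precedence difference, with made-up sample lines
    and array names. The && variant is written with ! as it usually would
    be; the comments show how each condition groups.

```perl
use strict;
use warnings;

# Hypothetical sample input.
my @lines = ('before', '---', 'blah', 'bleh', '---', 'after');

my (@with_and, @with_andand);
for (@lines) {
    # Groups as: (/^---/ ... /^---/) and (not /^---/)
    push @with_and, $_ if /^---/ ... /^---/ and not /^---/;
}
for (@lines) {
    # Groups as: /^---/ ... (/^---/ && !/^---/)
    # The right operand can never be true, so once the range has
    # started it never ends.
    push @with_andand, $_ if /^---/ ... /^---/ && !/^---/;
}

print "and: @with_and\n";    # and: blah bleh
print "&&:  @with_andand\n"; # &&:  --- blah bleh --- after
```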

    //Makholm
    Peter Makholm, Jun 2, 2009
    #7
  8. A. Farber

    Uri Guttman Guest

    >>>>> "jp" == jl post <> writes:

    jp> On Jun 2, 8:41 am, "A. Farber" <> wrote:

    jp> Use the "..." range operator, like this:

    you don't need the ... as .. will do fine. he can't match the left and
    right sides of .. on the same line as he has two different lines for
    delimiting.

    jp> perl -lne "print if /^---/ ... /^---/ and not /^---/"

    and if that is all he really needs, then he can just skip all --- lines
    with a simpler unix grep:

    foo | grep -v -e '----'

    OP: is there stuff to ignore between line groups delimited by ---? do
    you need to do anything other than print them (as in process them with
    more code)?

    uri

    Uri Guttman, Jun 2, 2009
    #8
  9. A. Farber

    Guest

    > >>>>> "jp" == jl post <> writes:
    >
    > jp> On Jun 2, 8:41 am, "A. Farber" <> wrote:
    >
    > jp> Use the "..." range operator, like this:
    >
    > jp> perl -lne "print if /^---/ ... /^---/"


    On Jun 2, 1:10 pm, "Uri Guttman" <> replied:
    > you don't need the ... as .. will do fine. he can't match the left and
    > right sides of .. on the same line as he has two different lines for
    > delimiting.


    I'm not sure I follow you, Uri. According to the original post,
    the delimiting lines are identical. Therefore, if '..' is used, the
    range operator will "flip and flop" on the same line, never showing
    the lines between the delimiting lines.

    To make sure he captures the lines between the '---' lines, the
    original poster should use the '...' operator, since according to
    "perldoc perlop", the '...' operator won't test the right operand
    until the next evaluation (as in sed).

    To illustrate, if I run the command:

    perl -E "say '---'; say foreach 10,11,12; say '---'"

    it gives me this output:

    ---
    10
    11
    12
    ---

    If I pipe it through the command with the '..' range operator, like
    this:

    perl -E "say '---'; say foreach 10,11,12; say '---'" | perl -lne "print if /^---/ .. /^---/"

    the output is:

    ---
    ---

    showing us that the range operator flipped and flopped on the same
    line, never printing anything but the "---" lines.

    But if we replace the '..' operator with '...', in this command:

    perl -E "say '---'; say foreach 10,11,12; say '---'" | perl -lne "print if /^---/ ... /^---/"

    we get this output:

    ---
    10
    11
    12
    ---

    So in this case, it does matter whether '..' or '...' is used.
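
    The same difference can be seen without a pipe by looking at what the
    operator returns. As perlop describes it, in scalar context it returns
    the empty string while off and a sequence number while on, with "E0"
    appended to the last number of a range. The script below is only a
    sketch; the array names are made up.

```perl
use strict;
use warnings;

# Same data as in the one-liners above.
my @lines = ('---', '10', '11', '12', '---');

my (@two_dots, @three_dots, @inner);
for (@lines) {
    # Two dots: the right operand is tested on the line that matched the
    # left one, so each "---" line is a complete one-line range.
    my $n = /^---/ .. /^---/;
    push @two_dots, "$_=$n" if $n;
}
for (@lines) {
    # Three dots: the right operand is first tested on the next evaluation.
    my $n = /^---/ ... /^---/;
    push @three_dots, "$_=$n" if $n;
}
for (@lines) {
    # The sequence number also allows dropping both endpoints.
    my $n = /^---/ ... /^---/;
    push @inner, $_ if $n && $n != 1 && $n !~ /E0$/;
}

print "two dots:   @two_dots\n";   # two dots:   ---=1E0 ---=1E0
print "three dots: @three_dots\n"; # three dots: ---=1 10=2 11=3 12=4 ---=5E0
print "inner:      @inner\n";      # inner:      10 11 12
```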

    -- Jean-Luc
    , Jun 3, 2009
    #9
  10. A. Farber

    Uri Guttman Guest

    >>>>> "jp" == jl post <> writes:

    >> >>>>> "jp" == jl post <> writes:

    >>

    jp> On Jun 2, 8:41 am, "A. Farber" <> wrote:
    >>

    jp> Use the "..." range operator, like this:
    >>

    jp> perl -lne "print if /^---/ ... /^---/"

    jp> On Jun 2, 1:10 pm, "Uri Guttman" <> replied:
    >> you don't need the ... as .. will do fine. he can't match the left and
    >> right sides of .. on the same line as he has two different lines for
    >> delimiting.


    jp> I'm not sure I follow you, Uri. According to the original post,
    jp> the delimiting lines are identical. Therefore, if '..' is used, the
    jp> range operator will "flip and flop" on the same line, never showing
    jp> the lines between the delimiting lines.

    maybe i flip flopped on which op tests the same lines. i haven't used
    them in a good while as i use the slurp/extract technique when i need to
    do this.

    jp> So in this case, it does matter whether '..' or '...' is used.

    seems like it. the slurp/extract technique is still faster and better. i
    have used the flip flop many times in the distant past but not in a good
    while.

    uri

    Uri Guttman, Jun 3, 2009
    #10
