grepping lines above and below a pattern found

Discussion in 'Perl Misc' started by mike, Apr 21, 2004.

  1. mike

    mike Guest

    hi

    say i have this in a text file

    "Use this form to
    post your messages.
    Remember that
    it can be viewed by
    millions of people worldwide.
    For questions about posting,
    check our FAQ"

    what is the most commonly used way to print ,say , 2 lines above and 2
    lines below a certain pattern found such that my output is

    "post your messages.
    Remember that
    it can be viewed by
    millions of people worldwide.
    For questions about posting, "

    for the matched pattern "viewed" .

    thanks...
     
    mike, Apr 21, 2004
    #1
    1. Advertising

  2. mike wrote:
    > say i have this in a text file
    >
    > "Use this form to
    > post your messages.
    > Remember that
    > it can be viewed by
    > millions of people worldwide.
    > For questions about posting,
    > check our FAQ"
    >
    > what is the most commonly used way to print ,say , 2 lines above
    > and 2 lines below a certain pattern found


    Don't know which way is most common. If the file isn't too big, an
    easy approach is to slurp all the lines into an array. Otherwise you
    can read the file line by line in a while loop, and keep the two
    latest lines saved temporarily.

    Which ways have you considered?

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Apr 21, 2004
    #2
    1. Advertising

  3. mike

    Paul Lalli Guest

    On Wed, 21 Apr 2004, mike wrote:

    > hi
    >
    > say i have this in a text file
    >
    > "Use this form to
    > post your messages.
    > Remember that
    > it can be viewed by
    > millions of people worldwide.
    > For questions about posting,
    > check our FAQ"
    >
    > what is the most commonly used way to print ,say , 2 lines above and 2
    > lines below a certain pattern found such that my output is
    >
    > "post your messages.
    > Remember that
    > it can be viewed by
    > millions of people worldwide.
    > For questions about posting, "
    >
    > for the matched pattern "viewed" .
    >
    > thanks...



    Assuming the file is not so huge that putting it all in memory at once
    will hurt you....


    #!/usr/bin/perl
    use strict;
    use warnings;

    {
    local $/;
    open FILE, 'file.txt' or die "Cannot open file: $!";
    my $contents = <FILE>;
    my $word = 'viewed';
    my @matches = $contents =~ /((?:.*\n){2}.*\b$word\b.*\n(?:.*\n){2})/g;
    foreach (@matches){
    print "$_\n";
    }
    }


    Paul Lalli
     
    Paul Lalli, Apr 21, 2004
    #3
  4. mike

    Abhinav Guest

    mike wrote:

    > hi
    >
    > say i have this in a text file
    >
    > "Use this form to
    > post your messages.
    > Remember that
    > it can be viewed by
    > millions of people worldwide.
    > For questions about posting,
    > check our FAQ"
    >
    > what is the most commonly used way to print ,say , 2 lines above and 2
    > lines below a certain pattern found such that my output is
    >
    > "post your messages.
    > Remember that
    > it can be viewed by
    > millions of people worldwide.
    > For questions about posting, "
    >
    > for the matched pattern "viewed" .
    >
    > thanks...

    Is this allowed ?

    system "grep -A2 -B2 $pattern $file";

    Thanks
    Abhinav
     
    Abhinav, Apr 21, 2004
    #4
  5. Abhinav <> wrote:

    > mike wrote:
    >
    >> say i have this in a text file
    >>
    >> "Use this form to
    >> post your messages.
    >> Remember that
    >> it can be viewed by
    >> millions of people worldwide.
    >> For questions about posting,
    >> check our FAQ"
    >>
    >> what is the most commonly used way to print ,say , 2 lines above
    >> and 2 lines below a certain pattern found such that my output is
    >>
    >> "post your messages.
    >> Remember that
    >> it can be viewed by
    >> millions of people worldwide.
    >> For questions about posting, "
    >>
    >> for the matched pattern "viewed" .
    >>
    >> thanks...

    > Is this allowed ?
    >
    > system "grep -A2 -B2 $pattern $file";


    Might not work well on an OS without grep. :)

    I'm tired of the question, anyway. How's this for a fairly general
    solution? The interface could probably be improved -- and I suppose
    it would be easier to just use Tie::File and loop through the lines.



    use strict;
    use warnings;

    print join "...\n",
    lines_around(
    qr/\bviewed\b/i,
    *DATA,
    {before => 2, after => 2}
    );


    sub lines_around {
    # pattern in filehandle with options...
    my ($pattern, $filehandle, $options) = @_;

    $options->{before} = defined $options->{before}
    ? $options->{before}
    : 1;
    $options->{after} = defined $options->{after}
    ? $options->{after}
    : 1;

    # return matching line, default 1
    $options->{match} = defined $options->{match}
    ? $options->{match}
    : 1;

    # return first match only, default 0
    $options->{first_only} = defined $options->{first_only}
    ? $options->{first_only}
    : 0;

    my (@buffer, @matches);
    while (my $line = <$filehandle>) {
    shift @buffer if @buffer > $options->{before};
    push @buffer, $line;
    next unless $line =~ /$pattern/;
    pop @buffer unless $options->{match};
    my @matched_lines;
    push @matched_lines, @buffer;
    push @matched_lines, scalar <DATA>
    for 1 .. $options->{after};
    return join('', @matched_lines) if $options->{first_only};
    push @matches, join '', @matched_lines;
    @buffer = ();
    }
    return @matches;
    }



    __DATA__
    Use this form to
    post your messages.
    Remember that
    it can be VIEWED by
    millions of people worldwide.
    For questions about posting,
    check our FAQ

    Use this form to
    post your messages.
    Remember that
    it can be VIEWED by
    millions of people worldwide.
    For questions about posting,
    check our FAQ
     
    David K. Wall, Apr 21, 2004
    #5
  6. mike <> wrote:

    > say i have this in a text file
    >
    > "Use this form to
    > post your messages.
    > Remember that
    > it can be viewed by
    > millions of people worldwide.
    > For questions about posting,
    > check our FAQ"
    >
    > what is the most commonly used way to print ,say , 2 lines above and 2
    > lines below a certain pattern found such that my output is
    >
    > "post your messages.
    > Remember that
    > it can be viewed by
    > millions of people worldwide.
    > For questions about posting, "
    >
    > for the matched pattern "viewed" .



    ------------------------------
    #!/usr/bin/perl
    use warnings;
    use strict;
    use Tie::File;

    my $filename = 'text.file';

    tie my @array, 'Tie::File', $filename, autochomp => 0 or
    die "could not tie file $!";

    foreach my $i ( 0 .. $#array ) {
    print @array[ $i-2 .. $i+2 ] if $array[$i] =~ /viewed/;
    }

    untie @array;
    ------------------------------


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Apr 21, 2004
    #6
  7. mike wrote:
    >
    > say i have this in a text file
    >
    > "Use this form to
    > post your messages.
    > Remember that
    > it can be viewed by
    > millions of people worldwide.
    > For questions about posting,
    > check our FAQ"
    >
    > what is the most commonly used way to print ,say , 2 lines above and 2
    > lines below a certain pattern found such that my output is
    >
    > "post your messages.
    > Remember that
    > it can be viewed by
    > millions of people worldwide.
    > For questions about posting, "
    >
    > for the matched pattern "viewed" .


    my $above = 2;
    my $below = 2;

    my @buffer;

    while ( <> ) {
    @buffer = ( @buffer, $_ )[ -( $above + 1 ) .. -1 ];
    /viewed/ && print @buffer, map scalar <>, 1 .. $below;
    }



    John
    --
    use Perl;
    program
    fulfillment
     
    John W. Krahn, Apr 21, 2004
    #7
  8. mike

    Brad Baxter Guest

    On Wed, 21 Apr 2004, mike wrote:

    > hi
    >
    > say i have this in a text file
    >
    > "Use this form to
    > post your messages.
    > Remember that
    > it can be viewed by
    > millions of people worldwide.
    > For questions about posting,
    > check our FAQ"
    >
    > what is the most commonly used way to print ,say , 2 lines above and 2
    > lines below a certain pattern found such that my output is
    >
    > "post your messages.
    > Remember that
    > it can be viewed by
    > millions of people worldwide.
    > For questions about posting, "
    >
    > for the matched pattern "viewed" .
    >
    > thanks...
    >


    perl
    -ne'/viewed/&&($x=$.);push@a,$_;$.>5&&shift@a;$x&&$.-$x>1&&last}{$x&&print@a'
    file

    Yes, it diverges from spec 2 lines from the end, but I didn't think
    anybody would care. :)

    Brad
     
    Brad Baxter, Apr 22, 2004
    #8
  9. mike

    Clyde Ingram Guest

    "Tad McClellan" <> wrote in message
    news:...
    > mike <> wrote:
    >
    > > what is the most commonly used way to print ,say , 2 lines above and 2
    > > lines below a certain pattern found such that my output is


    > tie my @array, 'Tie::File', $filename, autochomp => 0 or


    Nice .. didn't know about this.
    The OP might have some use for my inelegant meanderings below, on this
    theme.
    (Kindly forgive the overlong posting. I expect others have far more elegant
    solutions too.)

    Trying a "grep" on own source for pattern "[Tt]i." (running on Windoze XPee)
    with:

    trial_cgrep.pl [Tt]i. 2 trial_cgrep.pl

    shows (with "+" marking matched records):

    (2) :use warnings;
    (3) :use strict;
    (4)+:use Tie::File;
    (5) :use English;
    (6) :
    (24) :chomp @file_list;
    (25) :
    (26)+:die "$nr_lines: nr_lines should be zero or positive integer"
    (27) : unless ($nr_lines =~ /^\d+$/);
    (28) :
    (36) :for my $filename (@file_list) {
    (37) :
    (38)+: tie my @array, 'Tie::File', $filename, autochomp => 0 or
    (39)+: die "could not tie file $!";
    (40) :
    (41) : # Build a look-up table of record numbers already printed, so that
    we
    (72) : }
    (73) :
    (74)+: untie @array;
    (75) :}
    (76) :

    FWIW, the source is below.

    Regards,
    Clyde

    #!/usr/bin/perl
    use warnings;
    use strict;
    use Tie::File;
    use English;

    my $usage = "$PROGRAM_NAME pattern nr_lines file1 file2 file3 ...";
    my $help = "
    pattern: Search pattern.
    nr_lines: Number of records to show, before and after matching
    records.
    file1 ...: List of files to search.
    ";

    my $MIN_NR_ARGS_EXPECTED = 3;
    my $nr_args_supplied = 0 + @ARGV;

    die "Expected at least $MIN_NR_ARGS_EXPECTED arguments. " .
    "Got only $nr_args_supplied.\n" .
    "Usage: $usage\n$help" unless ($nr_args_supplied >=
    $MIN_NR_ARGS_EXPECTED);

    my $pattern = shift;
    my $nr_lines = shift;
    my @file_list = @ARGV;
    chomp @file_list;

    die "$nr_lines: nr_lines should be zero or positive integer"
    unless ($nr_lines =~ /^\d+$/);

    foreach (@file_list) {

    die "$_: No such readable file"
    unless (-r $_);
    }


    for my $filename (@file_list) {

    tie my @array, 'Tie::File', $filename, autochomp => 0 or
    die "could not tie file $!";

    # Build a look-up table of record numbers already printed, so that we
    # do not print the same record twice
    my %printed_nr = ();

    foreach my $i ( 0 .. $#array ) {
    if ($array[$i] =~ /$pattern/) {

    for my $j ($i-$nr_lines .. $i+$nr_lines) {

    next if $j < 0; # Avoid falling off start of array
    next if $j > $#array; # .. and end of array
    next if $printed_nr{$j}; # Skip record if already printed

    # Prepare print prefix for record being printed
    my @prefix_parts = ();

    # Omit the filename if only one supplied.
    push (@prefix_parts, "$filename ") if (@file_list > 1);

    # Record number. e.g. (42)
    push (@prefix_parts, "(" . ($j+1) . ")");

    # "+" indicator, if this record matches the pattern, else
    SPACE
    push (@prefix_parts, (($array[$j] =~ $pattern) ? "+" : "
    " ));

    my $prefix = join("", @prefix_parts) . ":";

    print $prefix . $array[$j];
    $printed_nr{$j} = 1;
    }
    }
    }

    untie @array;
    }


    ENDS
     
    Clyde Ingram, Apr 22, 2004
    #9
  10. mike

    Anno Siegel Guest

    Tad McClellan <> wrote in comp.lang.perl.misc:
    > mike <> wrote:
    >
    > > say i have this in a text file
    > >
    > > "Use this form to
    > > post your messages.
    > > Remember that
    > > it can be viewed by
    > > millions of people worldwide.
    > > For questions about posting,
    > > check our FAQ"
    > >
    > > what is the most commonly used way to print ,say , 2 lines above and 2
    > > lines below a certain pattern found such that my output is
    > >
    > > "post your messages.
    > > Remember that
    > > it can be viewed by
    > > millions of people worldwide.
    > > For questions about posting, "
    > >
    > > for the matched pattern "viewed" .

    >
    >
    > ------------------------------
    > #!/usr/bin/perl
    > use warnings;
    > use strict;
    > use Tie::File;
    >
    > my $filename = 'text.file';
    >
    > tie my @array, 'Tie::File', $filename, autochomp => 0 or
    > die "could not tie file $!";
    >
    > foreach my $i ( 0 .. $#array ) {
    > print @array[ $i-2 .. $i+2 ] if $array[$i] =~ /viewed/;
    > }
    >
    > untie @array;


    That's a good way to put Tie::File to use. It doesn't address two
    problems that come into view when the basic functionality is there:

    - Dealing with matches that are too close to the beginning or end of
    the file to fill a whole window

    - What to do with overlapping windows

    The OP hasn't addressed these issues, and I don't want to suggest
    solutions (some have been suggested in this thread). I just want
    to point out that this solution, pretty as it is, isn't the whole
    story.

    Anno
     
    Anno Siegel, Apr 22, 2004
    #10
  11. Clyde Ingram <> wrote:
    >
    > "Tad McClellan" <> wrote in message
    > news:...
    >> mike <> wrote:
    >>
    >> > what is the most commonly used way to print ,say , 2 lines above and 2
    >> > lines below a certain pattern found such that my output is

    >
    >> tie my @array, 'Tie::File', $filename, autochomp => 0 or

    >
    > Nice .. didn't know about this.



    I knew about it because I re-read the Perl FAQs from time to time. :)

    How do I change one line in a file/
    delete a line in a file/
    insert a line in the middle of a file/
    append to the beginning of a file?


    > my $nr_args_supplied = 0 + @ARGV;

    ^^^
    ^^^

    The LHS already ensures scalar context for @ARGV, so you don't need that.


    > my $pattern = shift;
    > my $nr_lines = shift;
    > my @file_list = @ARGV;


    my( $pattern, $nr_lines, @file_list ) = @ARGV;


    > chomp @file_list;



    You are expecting command line arguments with newlines in them?


    > die "$_: No such readable file"
    > unless (-r $_);



    That should probably be:

    die "$_: No such readable file"
    unless (-f $_ and -r _);


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Apr 22, 2004
    #11
  12. mike

    Brad Baxter Guest

    On Wed, 21 Apr 2004, Brad Baxter wrote:
    >
    > perl
    > -ne'/viewed/&&($x=$.);push@a,$_;$.>5&&shift@a;$x&&$.-$x>1&&last}{$x&&print@a'
    > file
    >
    > Yes, it diverges from spec 2 lines from the end, but I didn't think
    > anybody would care. :)


    It also has a problem with overlapping windows.

    perl
    -ne'/viewed/&&\!$x&&($x=$.);push@a,$_;$.>5&&shift@a;$x&&$.-$x>1&&last}{$x&&print@a'
    file

    (\! to hide !$ from the shell ... YMMV)

    Brad
     
    Brad Baxter, Apr 22, 2004
    #12
  13. mike

    Clyde Ingram Guest

    Tad,

    > I knew about it because I re-read the Perl FAQs from time to time. :)
    >
    > > my $nr_args_supplied = 0 + @ARGV;

    > ^^^
    > The LHS already ensures scalar context for @ARGV, so you don't need that.

    Ah, quite right.

    > > my $pattern = shift;
    > > my $nr_lines = shift;
    > > my @file_list = @ARGV;

    >
    > my( $pattern, $nr_lines, @file_list ) = @ARGV;

    Oh yes . . .

    > > chomp @file_list;

    >
    > You are expecting command line arguments with newlines in them?

    Er. . ., no

    > > die "$_: No such readable file"
    > > unless (-r $_);

    >
    > That should probably be:
    >
    > die "$_: No such readable file"
    > unless (-f $_ and -r _);

    Um, certainly. . .

    Thanks for the improvements, Tad.
    That was just a quick knock up script. (I did say it was inelegant.)
    Always nice to be shot down so constructively.

    Regards,
    Clyde
     
    Clyde Ingram, Apr 22, 2004
    #13
  14. mike

    David Combs Guest

    In article <Xns94D297D2B5106dkwwashere@216.168.3.30>,
    David K. Wall <> wrote:
    >Abhinav <> wrote:
    >
    >> mike wrote:
    >>
    >>> say i have this in a text file
    >>>
    >>> "Use this form to
    >>> post your messages.
    >>> Remember that
    >>> it can be viewed by
    >>> millions of people worldwide.
    >>> For questions about posting,
    >>> check our FAQ"
    >>>
    >>> what is the most commonly used way to print ,say , 2 lines above
    >>> and 2 lines below a certain pattern found such that my output is
    >>>
    >>> "post your messages.
    >>> Remember that
    >>> it can be viewed by
    >>> millions of people worldwide.
    >>> For questions about posting, "
    >>>
    >>> for the matched pattern "viewed" .
    >>>
    >>> thanks...

    >> Is this allowed ?
    >>
    >> system "grep -A2 -B2 $pattern $file";

    >
    >Might not work well on an OS without grep. :)


    Might not work well on an OS without GNU grep. :)

    David
     
    David Combs, May 21, 2004
    #14
  15. mike

    David Combs Guest

    In article <>, John W. Krahn <> wrote:
    >mike wrote:
    >>
    >> say i have this in a text file
    >>
    >> "Use this form to
    >> post your messages.
    >> Remember that
    >> it can be viewed by
    >> millions of people worldwide.
    >> For questions about posting,
    >> check our FAQ"
    >>
    >> what is the most commonly used way to print ,say , 2 lines above and 2
    >> lines below a certain pattern found such that my output is
    >>
    >> "post your messages.
    >> Remember that
    >> it can be viewed by
    >> millions of people worldwide.
    >> For questions about posting, "
    >>
    >> for the matched pattern "viewed" .

    >
    >my $above = 2;
    >my $below = 2;
    >
    >my @buffer;
    >
    >while ( <> ) {
    > @buffer = ( @buffer, $_ )[ -( $above + 1 ) .. -1 ];
    > /viewed/ && print @buffer, map scalar <>, 1 .. $below;
    > }
    >


    John: Perhaps you could add a few comments explaining
    your (as usual) nifty code?

    Once you've done that, maybe it can be inserted
    in to some tutorial part of the pod-files --
    looks like it'd be useful, and educational, for
    lots of people.

    Thanks,

    David
     
    David Combs, May 21, 2004
    #15
  16. mike

    Anno Siegel Guest

    John W. Krahn <> wrote in comp.lang.perl.misc:

    [...]

    > my $above = 2;
    > my $below = 2;
    >
    > my @buffer;
    >
    > while ( <> ) {
    > @buffer = ( @buffer, $_ )[ -( $above + 1 ) .. -1 ];
    > /viewed/ && print @buffer, map scalar <>, 1 .. $below;


    /viewed/ && print @buffer,
    grep defined, map scalar <DATA>, 1 .. $below;

    ....in case of matches near the end of the file.

    > }


    Anno
     
    Anno Siegel, May 21, 2004
    #16
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Bryan Krone

    perl efficiency -- fastest grepping?

    Bryan Krone, Nov 5, 2004, in forum: Perl
    Replies:
    1
    Views:
    1,504
    Jim Gibson
    Nov 8, 2004
  2. David A. Black
    Replies:
    3
    Views:
    133
    Florian Gilcher
    Jan 19, 2009
  3. Replies:
    13
    Views:
    291
  4. kiran
    Replies:
    12
    Views:
    1,216
    Scott Sauyet
    Dec 7, 2011
  5. kquirici
    Replies:
    13
    Views:
    1,042
    Rainer Weikusat
    Jun 19, 2012
Loading...

Share This Page