Program for retrieving certain lines in a file and writing them to another file

Discussion in 'Perl Misc' started by shahriar_saberi@yahoo.com, Jul 21, 2005.

  1. Guest

    Hi all,

    I am trying to write a program that scans a large log file, extracts
    only the lines that pertain to an error message, and writes them to a
    different file. Basically the log file will be of this format:

    Junk Junk ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
    Error.............................................\n
    Junk Junk Junk~~~~~~~~~~~~~~~~~~~~~~~~~~~~````
    Error..............\n

    So between the junk text there will be these long Error strings that
    are terminated by a new line.

    Now I know how to open the actual file and also looked up the tgrep
    function but I don't know how to put it all together.

    I guess the algorithm would be to find
    -first instance of Error
    -read everything between Error and the newline character into a variable
    -open the target file and write the buffer into the target file

    If anyone can guide me with some key perl commands or guide me to the
    right direction I would really appreciate it.

    Thanks,
    Shah
    , Jul 21, 2005
    #1

  2. Paul Lalli Guest

    wrote:
    > Hi all,
    >
    > I am trying to write a program that scans a large log file and only
    > extracts the lines that pertain to an error message and writes it to a
    > different line. Basically the log file will be of this format :
    >
    > Junk Junk ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
    > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
    > Error.............................................\n
    > Junk Junk Junk~~~~~~~~~~~~~~~~~~~~~~~~~~~~````
    > Error..............\n


    It's usually a far better idea to give us samples of the *actual* data.
    I have no idea if your error messages can span multiple lines, and if
    so, is the indicator only present on the first line?

    If NOT, that is if the Error message will only be on one line, and you
    don't care about anything else, I'd do a one-liner:

    perl -n -e"print if /^Error/" input.txt > output.txt

    (Obviously, you'd want to replace /^Error/ with the actual condition).

    This, of course, is really just a perl-way of writing:

    grep ^Error input.txt > output.txt

    > So between the junk text there will be these long Error strings that
    > are terminated by a new line.
    >
    > Now I know how to open the actual file and also looked up the tgrep
    > function but I don't know how to put it all together.


    Never heard of the tgrep function. Assuming a typo.

    > I guess the algorithm would be to find
    > -first instance of Error
    > -read everything between Error and the newline character into a variable
    > -open the target file and write the buffer into the target file


    That's a pretty bad algorithm. You're storing all the data in memory
    until it comes time to write the data out to a file. Why not write the
    data as you receive it?

    > If anyone can guide me with some key perl commands or guide me to the
    > right direction I would really appreciate it.


    If your desired lines may span more than one line, I'd say take
    advantage of the .. (flip-flop) operator:

    (untested)
    #!/usr/bin/perl
    use strict;
    use warnings;

    open my $in, '<', 'input.txt' or die "Cannot open input: $!";
    open my $out, '>', 'output.txt' or die "Cannot open output: $!";

    while (<$in>){
        if (/^Error/ .. /^Junk/){
            print $out $_;
        }
    }

    close $in;
    close $out;

    __END__

    Again, replace the pattern matches with the actual conditions that
    determine where your Error messages begin and end.
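    To illustrate how the range behaves, here is a minimal sketch that runs
    the same test over a few in-memory lines. The sample data is made up,
    since the real log format was not posted. It shows that the range stays
    true through the line matching the right-hand pattern, so that closing
    line has to be skipped explicitly if it is not wanted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data; the real log format is not shown in the thread.
my @lines = (
    "Junk Junk ~~~\n",
    "Error: disk full\n",
    "  continued detail\n",
    "Junk Junk Junk ~~~\n",
    "Error: timeout\n",
);

my @kept;
for (@lines) {
    # The range is true from a line matching /^Error/ through the next
    # line matching /^Junk/, inclusive of both ends.
    if (/^Error/ .. /^Junk/) {
        next if /^Junk/;    # drop the closing marker line itself
        push @kept, $_;
    }
}

print @kept;
```

    With this data the two Error lines and the continuation line are kept,
    and both Junk lines are dropped.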

    Read more about the flip-flop operator in
    perldoc perlop

    Read about opening files in:
    perldoc -f open

    Read about regular expressions in:
    perldoc perlre
    perldoc perlretut
    perldoc perlrequick

    Hope this helps,
    Paul Lalli
    Paul Lalli, Jul 21, 2005
    #2

  3. Debo Guest

    Re: Program for retrieving certain lines in a file and writing them to another file (OT)

    On Thu, 21 Jul 2005 wrote:
    > Hi all,
    >
    > I am trying to write a program that scans a large log file and only
    > extracts the lines that pertain to an error message and writes it to a
    > different line. Basically the log file will be of this format :
    >
    > Junk Junk ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
    > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
    > Error.............................................\n
    > Junk Junk Junk~~~~~~~~~~~~~~~~~~~~~~~~~~~~````
    > Error..............\n


    I don't know if this is a kosher suggestion, but you don't really need
    perl to do this if you're in *nix or cygwin (or your other favorite
    variant).

    grep "^Error" logfile | tr -d 'Error ' > output_file

    That's quick and dirty, and I'm assuming the word 'Error' only at the
    beginning of those error lines, but that's fixable without too much
    hassle.

    -Debo
    Debo, Jul 21, 2005
    #3
  4. Re: Program for retrieving certain lines in a file and writing them to another file

    wrote:
    > I am trying to write a program that scans a large log file and only
    > extracts the lines that pertain to an error message and writes it to a
    > different line.


    It does sound like a trivial task, if you know just a little about Perl.

    http://learn.perl.org/

    > Now I know how to open the actual file


    Good. Please show us! Actually, please show us the code you've got so
    far, or else it will be difficult to help you get it right.

    These are the posting guidelines for this group:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html

    > and also looked up the tgrep function


    What's that? Is it Perl?

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Jul 21, 2005
    #4
  5. Guest

    Thank you Paul,

    That is great help.

    Paul Lalli wrote:
    > wrote:
    > > Hi all,
    > >
    > > I am trying to write a program that scans a large log file and only
    > > extracts the lines that pertain to an error message and writes it to a
    > > different line. Basically the log file will be of this format :
    > >
    > > Junk Junk ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
    > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
    > > Error.............................................\n
    > > Junk Junk Junk~~~~~~~~~~~~~~~~~~~~~~~~~~~~````
    > > Error..............\n

    >
    > It's usually a far better idea to give us samples of the *actual* data.
    > I have no idea if your error messages can span multiple lines, and if
    > so, is the indicator only present on the first line?
    >
    > If NOT, that is if the Error message will only be on one line, and you
    > don't care about anything else, I'd do a one-liner:
    >
    > perl -n -e"print if /^Error/" input.txt > output.txt
    >
    > (Obviously, you'd want to replace /^Error/ with the actual condition).
    >
    > This, of course, is really just a perl-way of writing:
    >
    > grep ^Error input.txt > output.txt
    >
    > > So between the junk text there will be these long Error strings that
    > > are terminated by a new line.
    > >
    > > Now I know how to open the actual file and also looked up the tgrep
    > > function but I don't know how to put it all together.

    >
    > Never heard of the tgrep function. Assuming a typo.
    >
    > > I guess the algorithm would be to find
    > > -first instance of Error
    > > -read everything between Error and the newline character into a variable
    > > -open the target file and write the buffer into the target file

    >
    > That's a pretty bad algorithm. You're storing all the data in memory
    > until it comes time to write the data out to a file. Why not write the
    > data as you receive it?
    >
    > > If anyone can guide me with some key perl commands or guide me to the
    > > right direction I would really appreciate it.

    >
    > If your desired lines may span more than one line, I'd say take
    > advantage of the .. (flip-flop) operator:
    >
    > (untested)
    > #!/usr/bin/perl
    > use strict;
    > use warnings;
    >
    > open my $in, '<', 'input.txt' or die "Cannot open input: $!";
    > open my $out, '>', 'output.txt' or die "Cannot open output: $!";
    >
    > while (<$in>){
    > if (/^Error/ .. /^Junk/){
    > print $out $_;
    > }
    > }
    >
    > close $in;
    > close $out;
    >
    > __END__
    >
    > Again, replace the pattern matches with the actual conditions that
    > determine where your Error messages begin and end.
    >
    > Read more about the flip-flop operator in
    > perldoc perlop
    >
    > Read about opening files in:
    > perldoc -f open
    >
    > Read about regular expressions in:
    > perldoc perlre
    > perldoc perlretut
    > perldoc perlrequick
    >
    > Hope this helps,
    > Paul Lalli
    , Jul 21, 2005
    #5
  6. Guest

    Re: Program for retrieving certain lines in a file and writing them to another file (OT)

    Debi,

    The reason I need this to be in perl is because I want another perl
    script program to call this procedure, that's all.

    But thank you for your suggestion.


    > I don't know if this is a kosher suggestion, but you don't really need
    > perl to do this if you're in *nix or cygwin (or your other favorite
    > variant).
    >
    > grep "^Error" logfile | tr -d 'Error ' > output_file
    >
    > That's quick and dirty, and I'm assuming the word 'Error' only at the
    > beginning of those error lines, but that's fixable without too much
    > hassle.
    >
    > -Debo
    , Jul 21, 2005
    #6
  7. Guest

    Paul,

    I have actually been trying your suggestion and I assumed there is a
    space between the -e and "print. But when I run it I keep getting the
    following error:

    Bareword found where operator expected at C:\perlstuff\sample.pl line
    print if /^Error/" input"
    (Missing operator before input?)
    syntax error at C:\perlstuff\sample.pl line 1, near "n -e"
    Execution of C:\perlstuff\sample.pl aborted due to compilation errors.

    Thanks in advance for any ideas.

    Shah

    > If NOT, that is if the Error message will only be on one line, and you
    > don't care about anything else, I'd do a one-liner:
    >
    > perl -n -e"print if /^Error/" input.txt > output.txt
    >
    , Jul 22, 2005
    #7
  8. Paul Lalli Guest

    wrote:
    > I wrote:
    > > If NOT, that is if the Error message will only be on one line, and you
    > > don't care about anything else, I'd do a one-liner:
    > >
    > > perl -n -e"print if /^Error/" input.txt > output.txt

    >
    > I have actually been trying your suggestion and I assumed there is a
    > space between the -e and "print. But when I run it I keep getting the
    > following error:
    >
    > Bareword found where operator expected at C:\perlstuff\sample.pl line
    > print if /^Error/" input"
    > (Missing operator before input?)
    > syntax error at C:\perlstuff\sample.pl line 1, near "n -e"
    > Execution of C:\perlstuff\sample.pl aborted due to compilation errors.


    First, please learn how to correctly reply. Post your comments below
    what you are replying to, and do not snip the attributions. Thank you.

    Second, I do not understand what you're doing. That command I gave
    should be issued on the command line. What is your sample.pl file?
    There should not be any .pl file involved.

    And no, there does not need to be a space between -e and "print.

    Paul Lalli
    Paul Lalli, Jul 22, 2005
    #8
  9. Joe Smith Guest

    Re: Program for retrieving certain lines in a file and writing them to another file

    wrote:
    > Paul,
    >
    > I have actually been trying your suggestion and I assumed there is a
    > space between the -e and "print. But when I run it I keep getting the
    > following error:
    >
    > Bareword found where operator expected at C:\perlstuff\sample.pl line
    > print if /^Error/" input"
    > (Missing operator before input?)
    > syntax error at C:\perlstuff\sample.pl line 1, near "n -e"
    > Execution of C:\perlstuff\sample.pl aborted due to compilation errors.


    Get rid of sample.pl and it will work.

    C:\perlstuff>perl -n -e"print if /^Error/" input.txt > output.txt
    C:\perlstuff>dir output.txt
    C:\perlstuff>perldoc perlrun

    -Joe
    Joe Smith, Jul 22, 2005
    #9
  10. Joe Smith Guest

    Re: Program for retrieving certain lines in a file and writing them to another file (OT)

    Debo wrote:

    > grep "^Error" logfile | tr -d 'Error ' > output_file
    >
    > That's quick and dirty, and I'm assuming the word 'Error' only at the
    > beginning of those error lines, but that's fixable without too much
    > hassle.


    That's _not_ how you use 'tr'.

    echo Error: Everything is not all right | tr -d 'Error'
    : veything is nt all ight

    The solution needs 'sed -n' or a simple line of perl.

    unix% perl -ne 's/^Error: // and print' input.txt >output.txt
    C:\> perl -ne "s/^Error: // and print" input.txt >output.txt
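    For anyone who wants the same logic inside a script rather than on the
    command line, here is a small sketch of what `s/^Error: // and print` is
    doing. The sample lines and the "Error: " prefix are assumptions carried
    over from the example above, not the poster's real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines, assuming the "Error: " prefix used above.
my @lines = (
    "Junk Junk ~~~\n",
    "Error: Everything is not all right\n",
);

my @out;
for my $line (@lines) {
    my $copy = $line;    # work on a copy so @lines is left untouched
    # s/// returns true only if it replaced something, so "and" keeps
    # just the lines that carried the prefix, with the prefix removed.
    $copy =~ s/^Error: // and push @out, $copy;
}

print @out;    # prints "Everything is not all right"
```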

    -Joe
    Joe Smith, Jul 22, 2005
    #10
  11. Guest

    Paul Lalli wrote:
    > wrote:
    > > I wrote:
    > > > If NOT, that is if the Error message will only be on one line, and you
    > > > don't care about anything else, I'd do a one-liner:
    > > >
    > > > perl -n -e"print if /^Error/" input.txt > output.txt

    > >
    > > I have actually been trying your suggestion and I assumed there is a
    > > space between the -e and "print. But when I run it I keep getting the
    > > following error:
    > >
    > > Bareword found where operator expected at C:\perlstuff\sample.pl line
    > > print if /^Error/" input"
    > > (Missing operator before input?)
    > > syntax error at C:\perlstuff\sample.pl line 1, near "n -e"
    > > Execution of C:\perlstuff\sample.pl aborted due to compilation errors.

    >
    > First, please learn how to correctly reply. Post your comments below
    > what you are replying to, and do not snip the attributions. Thank you.


    I am sorry, I hope I am doing it correctly now :)
    >
    > Second, I do not understand what you're doing. That command I gave
    > should be issued on the command line. What is your sample.pl file?
    > There should not be any .pl file involved.


    Well I wasn't trying to run this from the command line, I just wanted
    to run this from a perl file which is going to do other things as well
    before getting to this procedure. I hope this clarifies things.

    Thanks,
    Shah
    , Jul 22, 2005
    #11
  12. Guest

    Joe Smith wrote:
    > wrote:
    > > Paul,
    > >
    > > I have actually been trying your suggestion and I assumed there is a
    > > space between the -e and "print. But when I run it I keep getting the
    > > following error:
    > >
    > > Bareword found where operator expected at C:\perlstuff\sample.pl line
    > > print if /^Error/" input"
    > > (Missing operator before input?)
    > > syntax error at C:\perlstuff\sample.pl line 1, near "n -e"
    > > Execution of C:\perlstuff\sample.pl aborted due to compilation errors.

    >
    > Get rid of sample.pl and it will work.
    >
    > C:\perlstuff>perl -n -e"print if /^Error/" input.txt > output.txt
    > C:\perlstuff>dir output.txt
    > C:\perlstuff>perldoc perlrun


    Joe,

    I was trying to run this procedure from within my perl file which is
    called sample.pl. The idea is that there will be other operations as
    well before getting to the one outlined above.

    Thanks,
    Shah
    , Jul 22, 2005
    #12
  13. Re: Program for retrieving certain lines in a file and writing them to another file (OT)

    On 21 Jul 2005 13:42:00 -0700
    wrote:

    > Debi,
    >
    > The reason I need this to be in perl is because I want another perl
    > script program to call this procedure, that's all.


    Here's what I'd do:
    As I see it, this is rather simple and can be done with a regexp like the ones already mentioned, e.g. /^Error\s*(.+)$/.
    This matches each line that begins with the string 'Error', followed by any amount of whitespace (including none), followed by the remainder of the line. The relevant text then ends up in $1.
    This is how it would look in an actual program:

    #!/usr/bin/perl

    use strict;
    use warnings;

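    # Three-argument open with lexical filehandles
    open my $in,  '<', 'foo.txt' or die "Couldn't open foo.txt: $!";
    open my $out, '>', 'bar.txt' or die "Couldn't open bar.txt: $!";

    while (<$in>) {
        # $1 excludes the trailing newline, so add one back
        print $out "$1\n" if /^Error\s*(.+)$/i; # match case insensitively
    }

    close $in;
    close $out;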

    Of course this virtually screams for a one liner but if you need to call it from within another script that's what you might try.
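    As a quick check of what that pattern captures, here is a small sketch
    run against a few made-up lines (the sample strings are assumptions, not
    real log data). Note that $1 never includes the line's trailing newline,
    because . does not match a newline and $ matches just before it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines for illustration only.
my @samples = (
    "Error   disk full\n",
    "error: timeout\n",
    "Junk ~~~\n",
);

my @captured;
for my $line (@samples) {
    # /i makes the match case-insensitive; \s* skips any whitespace
    # after "Error"; (.+) captures the rest of the line.
    if ($line =~ /^Error\s*(.+)$/i) {
        push @captured, $1;
        print "captured: [$1]\n";
    }
}
# prints:
#   captured: [disk full]
#   captured: [: timeout]
```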

    SveTho
    Sven-Thorsten Fahrbach, Jul 24, 2005
    #13
  14. Paul Lalli Guest

    wrote:
    > Paul Lalli wrote:
    > > First, please learn how to correctly reply. Post your comments below
    > > what you are replying to, and do not snip the attributions. Thank you.

    >
    > I am sorry, I hope I am doing it correctly now :)


    Very much so. Thank you.

    > > Second, I do not understand what you're doing. That command I gave
    > > should be issued on the command line. What is your sample.pl file?
    > > There should not be any .pl file involved.

    >
    > Well I wasn't trying to run this from the command line, I just wanted
    > to run this from a perl file which is going to do other things as well
    > before getting to this procedure. I hope this clarifies things.


    Well then. You probably should have specified that originally....

    In that case, you won't be able to take advantage of the magic of the
    -n switch. You can look at perldoc perlrun to see exactly what -n does
    and put that into your script. Basically, you're going to have three
    steps:
    * open the input file for reading
    * open the output file for writing
    * loop through all lines of the input file, printing to the output file
    only those lines you want to keep

    The finished chunk of code will look something like this: (UNTESTED)

    open my $in, '<', 'input.txt' or die "Cannot open input: $!";
    open my $out, '>', 'output.txt' or die "Cannot open output: $!";
    while (<$in>) {
        print $out $_ if /^Error/;
    }
    close $in;
    close $out;

    You can read more about all of these lines of code in:
    perldoc -f open
    perldoc -f readline
    perldoc -f print
    perldoc perlre
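    Since the goal is to call this from a larger script, the three steps
    above could also be wrapped in a subroutine. This is only a sketch under
    assumptions: the name extract_errors, the optional pattern argument, and
    the returned line count are all made up for illustration, and /^Error/
    still stands in for the real condition.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# extract_errors is a hypothetical name; the default pattern is a
# placeholder for whatever really identifies an error line.
sub extract_errors {
    my ($in_path, $out_path, $pattern) = @_;
    $pattern = qr/^Error/ unless defined $pattern;

    open my $in,  '<', $in_path  or die "Cannot open $in_path: $!";
    open my $out, '>', $out_path or die "Cannot open $out_path: $!";

    my $count = 0;
    while (my $line = <$in>) {
        next unless $line =~ $pattern;   # skip lines we do not want
        print {$out} $line;              # write each match as it is read
        $count++;
    }

    close $in;
    close $out or die "Cannot close $out_path: $!";
    return $count;    # number of lines written
}
```

    The rest of the script could then call something like
    extract_errors('input.txt', 'output.txt') at the point where the
    filtering is needed.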

    Paul Lalli
    Paul Lalli, Jul 25, 2005
    #14
