Program for retrieving certain lines in a file and writing them to another file

Discussion in 'Perl Misc' started by shahriar_saberi, Jul 21, 2005.

  1. Hi all,

    I am trying to write a program that scans a large log file, extracts
    only the lines that pertain to an error message, and writes them to a
    different file. Basically the log file will be of this format:

    Junk Junk ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
    Error.............................................\n
    Junk Junk Junk~~~~~~~~~~~~~~~~~~~~~~~~~~~~````
    Error..............\n

    So between the junk text there will be these long Error strings that
    are terminated by a new line.

    Now I know how to open the actual file and also looked up the tgrep
    function but I don't know how to put it all together.

    I guess the algorithm would be to find
    -first instance of Error
    -read everything between Error and the newline character into a variable
    -open the target file and write the buffer into the target file

    If anyone can guide me with some key perl commands or guide me to the
    right direction I would really appreciate it.

    Thanks,
    Shah
     
    shahriar_saberi, Jul 21, 2005
    #1

  2. shahriar_saberi

    Paul Lalli Guest

    It's usually a far better idea to give us samples of the *actual* data.
    I have no idea if your error messages can span multiple lines, and if
    so, is the indicator only present on the first line?

    If NOT, that is if the Error message will only be on one line, and you
    don't care about anything else, I'd do a one-liner:

    perl -n -e"print if /^Error/" input.txt > output.txt

    (Obviously, you'd want to replace /^Error/ with the actual condition).

    This, of course, is really just a perl-way of writing:

    grep ^Error input.txt > output.txt

    Never heard of the tgrep function. Assuming a typo.

    That's a pretty bad algorithm. You're storing all the data in memory
    until it comes time to write the data out to a file. Why not write the
    data as you receive it?

    If your desired lines may span more than one line, I'd say take
    advantage of the .. (flip-flop) operator:

    (untested)
    #!/usr/bin/perl
    use strict;
    use warnings;

    open my $in, '<', 'input.txt' or die "Cannot open input: $!";
    open my $out, '>', 'output.txt' or die "Cannot open output: $!";

    while (<$in>) {
        if (/^Error/ .. /^Junk/) {
            print $out $_;
        }
    }

    close $in;
    close $out;

    __END__

    Again, replace the pattern matches with the actual conditions that
    determine where your Error messages begin and end.
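
    A minimal sketch of how the flip-flop range behaves, run on made-up
    in-memory lines rather than a real log. The sample data and the sub
    name error_blocks are assumptions for illustration only. The range is
    inclusive at both ends, so the line matching the closing pattern is
    selected as well; this sketch drops it explicitly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the lines that belong to error blocks.
# A block starts at a line matching /^Error/ and runs until the next
# line matching /^Junk/.
sub error_blocks {
    my @lines = @_;
    my @kept;
    for (@lines) {
        # The range is true from the opening line through the closing
        # line, inclusive, so the closing /^Junk/ line is skipped here.
        if (/^Error/ .. /^Junk/) {
            push @kept, $_ unless /^Junk/;
        }
    }
    return @kept;
}

# Made-up sample data, not taken from a real log.
my @sample = (
    "Junk one\n",
    "Error: disk full\n",
    "  continued detail\n",
    "Junk two\n",
    "Error: timeout\n",
    "Junk three\n",
);

print error_blocks(@sample);
```

    If the closing line should be kept, as in the loop above, remove the
    "unless /^Junk/" condition.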

    Read more about the flip-flop operator in
    perldoc perlop

    Read about opening files in:
    perldoc -f open

    Read about regular expressions in:
    perldoc perlre
    perldoc perlretut
    perldoc perlrequick

    Hope this helps,
    Paul Lalli
     
    Paul Lalli, Jul 21, 2005
    #2

  3. shahriar_saberi

    Debo Guest

    I don't know if this is a kosher suggestion, but you don't really need
    perl to do this if you're in *nix or cygwin (or your other favorite
    variant).

    grep "^Error" logfile | tr -d 'Error ' > output_file

    That's quick and dirty, and I'm assuming the word 'Error' only at the
    beginning of those error lines, but that's fixable without too much
    hassle.

    -Debo
     
    Debo, Jul 21, 2005
    #3
  4. It does sound like a trivial task, if you know just a little about Perl.

    http://learn.perl.org/

    Good. Please show us! Actually, please show us the code you've got so
    far, or else it will be difficult to help you get it right.

    These are the posting guidelines for this group:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html

    What's that? Is it Perl?
     
    Gunnar Hjalmarsson, Jul 21, 2005
    #4
  5. Thank you Paul,

    That is great help.

     
    shahriar_saberi, Jul 21, 2005
    #5
  6. Debo,

    The reason I need this to be in perl is because I want another perl
    script program to call this procedure, that's all.

    But thank you for your suggestion.
     
    shahriar_saberi, Jul 21, 2005
    #6
  7. Paul,

    I have actually been trying your suggestion and I assumed there is a
    space between the -e and "print. But when I run it I keep getting the
    following error:

    Bareword found where operator expected at C:\perlstuff\sample.pl line
    print if /^Error/" input"
    (Missing operator before input?)
    syntax error at C:\perlstuff\sample.pl line 1, near "n -e"
    Execution of C:\perlstuff\sample.pl aborted due to compilation errors.

    Thanks in advance for any ideas.

    Shah
     
    shahriar_saberi, Jul 22, 2005
    #7
  8. shahriar_saberi

    Paul Lalli Guest

    First, please learn how to correctly reply. Post your comments below
    what you are replying to, and do not snip the attributions. Thank you.

    Second, I do not understand what you're doing. That command I gave
    should be issued on the command line. What is your sample.pl file?
    There should not be any .pl file involved.

    And no, there does not need to be a space between -e and "print.

    Paul Lalli
     
    Paul Lalli, Jul 22, 2005
    #8
  9. shahriar_saberi

    Joe Smith Guest

    Get rid of sample.pl and it will work.

    C:\perlstuff>perl -n -e"print if /^Error/" input.txt > output.txt
    C:\perlstuff>dir output.txt
    C:\perlstuff>perldoc perlrun

    -Joe
     
    Joe Smith, Jul 22, 2005
    #9
  10. shahriar_saberi

    Joe Smith Guest

    That's _not_ how you use 'tr'.

    echo Error: Everything is not all right | tr -d 'Error'
    : veything is nt all ight

    The solution needs 'sed -n' or a simple line of perl.

    unix% perl -ne 's/^Error: // and print' input.txt >output.txt
    C:\> perl -ne "s/^Error: // and print" input.txt >output.txt

    -Joe
     
    Joe Smith, Jul 22, 2005
    #10
  11. I am sorry, I hope I am doing it correctly now :)
    Well I wasn't trying to run this from the command line, I just wanted
    to run this from a perl file which is going to do other things as well
    before getting to this procedure. I hope this clarifies things.

    Thanks,
    Shah
     
    shahriar_saberi, Jul 22, 2005
    #11
  12. Joe,

    I was trying to run this procedure from within my perl file which is
    called sample.pl. The idea is that there will be other operations as
    well before getting to the one outlined above.

    Thanks,
    Shah
     
    shahriar_saberi, Jul 22, 2005
    #12
  13. On 21 Jul 2005 13:42:00 -0700
    Here's what I'd do:
    As I see it, it's rather simple really, something that could easily be done with the regexp already mentioned before, i.e. something like /^Error\s*(.+)$/.
    This matches each line with the string 'Error' at the beginning, followed by any amount of whitespace (including none), followed by the remainder of the line. You would then find the relevant text in $1.
    This is how it would look in an actual program:

    #!/usr/bin/perl

    use strict;
    use warnings;

    open IN, "foo.txt" or die "Couldn't open foo.txt: $!";
    open OUT, "> bar.txt" or die "Couldn't open bar.txt: $!";

    while (<IN>) {
        # $1 does not include the trailing newline, so add one back
        print OUT "$1\n" if /^Error\s*(.+)$/i; # match case insensitively
    }

    Of course this virtually screams for a one liner but if you need to call it from within another script that's what you might try.
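
    A small sketch of what that regexp captures, using made-up input lines.
    The sub name error_text is hypothetical. The captured text in $1 does not
    include the trailing newline, so one has to be added when printing.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text that follows a leading "Error"
# (matched case insensitively), or undef when the line is not an error line.
sub error_text {
    my ($line) = @_;
    return $line =~ /^Error\s*(.+)$/i ? $1 : undef;
}

# Made-up sample lines for illustration.
for my $line ("Error disk full\n", "Junk Junk\n", "error: timeout\n") {
    my $text = error_text($line);
    print defined $text ? "kept: $text\n" : "skipped\n";
}
```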

    SveTho
     
    Sven-Thorsten Fahrbach, Jul 24, 2005
    #13
  14. shahriar_saberi

    Paul Lalli Guest

    Very much so. Thank you.

    Well then. You probably should have specified that originally....

    In that case, you won't be able to take advantage of the magic of the
    -n switch. You can look at perldoc perlrun to see exactly what -n does
    and put that into your script. Basically, you're going to have three
    steps:
    * open the input file for reading
    * open the output file for writing
    * loop through all lines of the input file, printing to the output file
    only those lines you want to keep

    The finished chunk of code will look something like this: (UNTESTED)

    open my $in, '<', 'input.txt' or die "Cannot open input: $!";
    open my $out, '>', 'output.txt' or die "Cannot open output: $!";
    while (<$in>) {
        print $out $_ if /^Error/;
    }
    close $in;
    close $out;

    You can read more about all of these lines of code in:
    perldoc -f open
    perldoc -f readline
    perldoc -f print
    perldoc perlre
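
    Since the goal is to call this from a larger script, the three steps
    can be wrapped in a subroutine. This is only a sketch: the sub name
    extract_errors, its parameters, and the file names are assumptions,
    and /^Error/ stands in for whatever the real condition is.

```perl
use strict;
use warnings;

# Hypothetical sub: copies every line of $in_path that matches $pattern
# to $out_path and returns the number of lines written.
sub extract_errors {
    my ($in_path, $out_path, $pattern) = @_;

    open my $in,  '<', $in_path  or die "Cannot open $in_path: $!";
    open my $out, '>', $out_path or die "Cannot open $out_path: $!";

    my $count = 0;
    while (my $line = <$in>) {
        next unless $line =~ $pattern;
        print {$out} $line;    # write each match as soon as it is read
        $count++;
    }

    close $in;
    close $out or die "Cannot close $out_path: $!";
    return $count;
}

# Example call; it only runs when the placeholder input file exists.
if (-e 'input.txt') {
    my $n = extract_errors('input.txt', 'output.txt', qr/^Error/);
    print "Wrote $n error line(s)\n";
}
```

    Passing the pattern as a qr// object keeps the sub reusable for other
    conditions without editing its body.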

    Paul Lalli
     
    Paul Lalli, Jul 25, 2005
    #14