Program for retrieving certain lines in a file and writing them to another file

shahriar_saberi

Hi all,

I am trying to write a program that scans a large log file, extracts
only the lines that pertain to an error message, and writes them to a
different file. Basically the log file will be of this format:

Junk Junk ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
Error.............................................\n
Junk Junk Junk~~~~~~~~~~~~~~~~~~~~~~~~~~~~````
Error..............\n

So between the junk text there will be these long Error strings that
are terminated by a new line.

Now I know how to open the actual file and also looked up the tgrep
function but I don't know how to put it all together.

I guess the algorithm would be to find
-first instance of Error
-read everything between Error and the newline character into a variable
-open the target file and write the buffer into the target file

If anyone can guide me with some key perl commands or guide me to the
right direction I would really appreciate it.

Thanks,
Shah
 
Paul Lalli

Hi all,

I am trying to write a program that scans a large log file, extracts
only the lines that pertain to an error message, and writes them to a
different file. Basically the log file will be of this format:

Junk Junk ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
Error.............................................\n
Junk Junk Junk~~~~~~~~~~~~~~~~~~~~~~~~~~~~````
Error..............\n

It's usually a far better idea to give us samples of the *actual* data.
I have no idea if your error messages can span multiple lines, and if
so, is the indicator only present on the first line?

If NOT, that is if the Error message will only be on one line, and you
don't care about anything else, I'd do a one-liner:

perl -n -e"print if /^Error/" input.txt > output.txt

(Obviously, you'd want to replace /^Error/ with the actual condition).

This, of course, is really just a perl-way of writing:

grep ^Error input.txt > output.txt
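
For anyone who needs the same filter inside a script, here is a rough sketch of what the -n switch does for the one-liner above: it wraps the -e code in a loop that reads the input one line at a time. The sub name filter_errors and the sample data are made up for illustration, and the in-memory filehandles are only there so the sketch can be tried without real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Roughly what `perl -n -e"print if /^Error/"` does: read the input
# one line at a time and run the -e code on each line.
# filter_errors() is a hypothetical name; it takes an input and an
# output filehandle so it can be tried without real files.
sub filter_errors {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        print {$out} $line if $line =~ /^Error/;
    }
    return;
}

# Demo on an in-memory "file" (made-up sample data).
my $log      = "Junk Junk ~~~\nError: disk full\nJunk ~~~\nError: timeout\n";
my $filtered = '';
open my $in,  '<', \$log      or die "Cannot open in-memory input: $!";
open my $out, '>', \$filtered or die "Cannot open in-memory output: $!";
filter_errors($in, $out);
close $in;
close $out;
print $filtered;    # the two lines that start with "Error"
```

Each line is written as soon as it is read, so memory use stays flat no matter how large the log is.
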
So between the junk text there will be these long Error strings that
are terminated by a new line.

Now I know how to open the actual file and also looked up the tgrep
function but I don't know how to put it all together.

Never heard of the tgrep function. Assuming a typo.
I guess the algorithm would be to find
-first instance of Error
-read everything between Error and the newline character into a variable
-open the target file and write the buffer into the target file

That's a pretty bad algorithm. You're storing all the data in memory
until it comes time to write the data out to a file. Why not write the
data as you receive it?
If anyone can guide me with some key perl commands or guide me to the
right direction I would really appreciate it.

If your desired lines may span more than one line, I'd say take
advantage of the .. (flip-flop) operator:

(untested)
#!/usr/bin/perl
use strict;
use warnings;

open my $in, '<', 'input.txt' or die "Cannot open input: $!";
open my $out, '>', 'output.txt' or die "Cannot open output: $!";

while (<$in>) {
    if (/^Error/ .. /^Junk/) {
        print $out $_;
    }
}

close $in;
close $out;

__END__

Again, replace the pattern matches with the actual conditions that
determine where your Error messages begin and end.

Read more about the flip-flop operator in
perldoc perlop

Read about opening files in:
perldoc -f open

Read about regular expressions in:
perldoc perlre
perldoc perlretut
perldoc perlrequick

Hope this helps,
Paul Lalli
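
A small sketch of how the flip-flop suggested above behaves, run on made-up sample lines rather than a real log. One detail worth knowing: the range is true up to and including the line that matches the right-hand pattern, so the terminating Junk line would be printed too. perlop documents that the last sequence number of a range has "E0" appended, and the sketch uses that to drop the terminator. The helper name error_blocks and the data are assumptions for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample: each error may continue over several lines and
# ends where the next "Junk" line begins.
my @lines = (
    "Junk Junk ~~~\n",
    "Error: disk full\n",
    "  while writing block 7\n",
    "Junk Junk Junk ~~~\n",
    "Error: timeout\n",
    "Junk ~~~\n",
);

# Hypothetical helper: returns the lines inside each Error..Junk range.
sub error_blocks {
    my @kept;
    for (@_) {
        # In scalar context, .. is the flip-flop: false until the left
        # pattern matches, then true up to and including the line on
        # which the right pattern matches.
        my $seq = /^Error/ .. /^Junk/;
        next unless $seq;
        # The last sequence number of a range has "E0" appended, which
        # lets us drop the terminating Junk line.
        push @kept, $_ unless $seq =~ /E0$/;
    }
    return @kept;
}

print error_blocks(@lines);
```

With the sample above this keeps the two Error lines plus the continuation line and skips every Junk line.
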
 
Debo

Hi all,

I am trying to write a program that scans a large log file, extracts
only the lines that pertain to an error message, and writes them to a
different file. Basically the log file will be of this format:

Junk Junk ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~```
Error.............................................\n
Junk Junk Junk~~~~~~~~~~~~~~~~~~~~~~~~~~~~````
Error..............\n

I don't know if this is a kosher suggestion, but you don't really need
perl to do this if you're in *nix or cygwin (or your other favorite
variant).

grep "^Error" logfile | tr -d 'Error ' > output_file

That's quick and dirty, and I'm assuming the word 'Error' appears only
at the beginning of those error lines, but that's fixable without too
much hassle.

-Debo
 
Gunnar Hjalmarsson

I am trying to write a program that scans a large log file, extracts
only the lines that pertain to an error message, and writes them to a
different file.

It does sound like a trivial task, if you know just a little about Perl.

http://learn.perl.org/
Now I know how to open the actual file

Good. Please show us! Actually, please show us the code you've got so
far, or else it will be difficult to help you get it right.

These are the posting guidelines for this group:
http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
and also looked up the tgrep function

What's that? Is it Perl?
 
shahriar_saberi

Thank you Paul,

That is great help.

 
shahriar_saberi

Debo,

The reason I need this to be in perl is because I want another perl
script program to call this procedure, that's all.

But thank you for your suggestion.
 
shahriar_saberi

Paul,

I have actually been trying your suggestion and I assumed there is a
space between the -e and "print. But when I run it I keep getting the
following error:

Bareword found where operator expected at C:\perlstuff\sample.pl line
print if /^Error/" input"
(Missing operator before input?)
syntax error at C:\perlstuff\sample.pl line 1, near "n -e"
Execution of C:\perlstuff\sample.pl aborted due to compilation errors.

Thanks in advance for any ideas.

Shah
 
Paul Lalli

I have actually been trying your suggestion and I assumed there is a
space between the -e and "print. But when I run it I keep getting the
following error:

Bareword found where operator expected at C:\perlstuff\sample.pl line
print if /^Error/" input"
(Missing operator before input?)
syntax error at C:\perlstuff\sample.pl line 1, near "n -e"
Execution of C:\perlstuff\sample.pl aborted due to compilation errors.

First, please learn how to correctly reply. Post your comments below
what you are replying to, and do not snip the attributions. Thank you.

Second, I do not understand what you're doing. That command I gave
should be issued on the command line. What is your sample.pl file?
There should not be any .pl file involved.

And no, there does not need to be a space between -e and "print.

Paul Lalli
 
Joe Smith

Paul,

I have actually been trying your suggestion and I assumed there is a
space between the -e and "print. But when I run it I keep getting the
following error:

Bareword found where operator expected at C:\perlstuff\sample.pl line
print if /^Error/" input"
(Missing operator before input?)
syntax error at C:\perlstuff\sample.pl line 1, near "n -e"
Execution of C:\perlstuff\sample.pl aborted due to compilation errors.

Get rid of sample.pl and it will work.

C:\perlstuff>perl -n -e"print if /^Error/" input.txt > output.txt
C:\perlstuff>dir output.txt
C:\perlstuff>perldoc perlrun

-Joe
 
Joe Smith

Debo said:
grep "^Error" logfile | tr -d 'Error ' > output_file

That's quick and dirty, and I'm assuming the word 'Error' appears only
at the beginning of those error lines, but that's fixable without too
much hassle.

That's _not_ how you use 'tr'.

echo Error: Everything is not all right | tr -d 'Error'
: veything is nt all ight

The solution needs 'sed -n' or a simple line of perl.

unix% perl -ne 's/^Error: // and print' input.txt >output.txt
C:\> perl -ne "s/^Error: // and print" input.txt >output.txt

-Joe
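
To make the difference concrete inside Perl itself, here is a minimal sketch that puts Perl's own tr/// operator (which, like the shell tool, works on a set of characters) next to s/// (which works on a pattern). The sample string is the one from the echo example above; the variable names are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "Error: Everything is not all right";

# tr/// works on a set of characters: with /d it deletes every
# 'E', 'r' and 'o' anywhere in the string, not the word "Error".
(my $with_tr = $line) =~ tr/Ero//d;

# s/// works on a pattern: this removes only the leading "Error: ".
(my $with_s = $line) =~ s/^Error: //;

print "tr: $with_tr\n";   # tr: : veything is nt all ight
print "s:  $with_s\n";    # s:  Everything is not all right
```
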
 
shahriar_saberi

Paul said:
First, please learn how to correctly reply. Post your comments below
what you are replying to, and do not snip the attributions. Thank you.

I am sorry, I hope I am doing it correctly now :)
Second, I do not understand what you're doing. That command I gave
should be issued on the command line. What is your sample.pl file?
There should not be any .pl file involved.

Well I wasn't trying to run this from the command line, I just wanted
to run this from a perl file which is going to do other things as well
before getting to this procedure. I hope this clarifies things.

Thanks,
Shah
 
shahriar_saberi

Joe said:
Get rid of sample.pl and it will work.

C:\perlstuff>perl -n -e"print if /^Error/" input.txt > output.txt
C:\perlstuff>dir output.txt
C:\perlstuff>perldoc perlrun

Joe,

I was trying to run this procedure from within my perl file which is
called sample.pl. The idea is that there will be other operations as
well before getting to the one outlined above.

Thanks,
Shah
 
Sven-Thorsten Fahrbach

On 21 Jul 2005 13:42:00 -0700
Debo,

The reason I need this to be in perl is because I want another perl
script program to call this procedure, that's all.

Here's what I'd do:
As I see it, it's rather simple really, something that could easily be done with the regexp already mentioned before, i.e. something like /^Error\s*(.+)$/.
This matches each line with the string 'Error' at the beginning, followed by any amount of whitespace (including none), followed by the remainder of the line. You would then find the relevant text in $1.
That's what it would look like in an actual program:

#!/usr/bin/perl

use strict;
use warnings;

open IN, "foo.txt" or die "Couldn't open foo.txt: $!";
open OUT, "> bar.txt" or die "Couldn't open bar.txt: $!";

while (<IN>) {
    # $1 does not include the line's newline, so add one back
    print OUT "$1\n" if /^Error\s*(.+)$/i; # match case insensitively
}

Of course this virtually screams for a one liner but if you need to call it from within another script that's what you might try.

SveTho
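
A quick way to see what that regexp captures is to try it on a few strings before pointing it at a file. This is only a sketch: the helper name error_text and the sample lines are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the text after a leading "Error", or
# undef when the line does not start with "Error" (any case).
sub error_text {
    my ($line) = @_;
    return $line =~ /^Error\s*(.+)$/i ? $1 : undef;
}

for my $line ("Error   disk full\n", "error: timeout\n", "Junk Junk ~~~\n") {
    my $text = error_text($line);
    print defined $text ? "captured: [$text]\n" : "skipped\n";
}
```

Note that the captured text has no trailing newline, because . does not match a newline and $ matches just before it. Anything that prints $1 to a file has to add the newline itself.
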
 
Paul Lalli

I am sorry, I hope I am doing it correctly now :)

Very much so. Thank you.
Well I wasn't trying to run this from the command line, I just wanted
to run this from a perl file which is going to do other things as well
before getting to this procedure. I hope this clarifies things.

Well then. You probably should have specified that originally....

In that case, you won't be able to take advantage of the magic of the
-n switch. You can look at perldoc perlrun to see exactly what -n does
and put that into your script. Basically, you're going to have three
steps:
* open the input file for reading
* open the output file for writing
* loop through all lines of the input file, printing to the output file
only those lines you want to keep

The finished chunk of code will look something like this: (UNTESTED)

open my $in, '<', 'input.txt' or die "Cannot open input: $!";
open my $out, '>', 'output.txt' or die "Cannot open output: $!";
while (<$in>) {
    print $out $_ if /^Error/;
}
close $in;
close $out;

You can read more about all of these lines of code in:
perldoc -f open
perldoc -f readline
perldoc -f print
perldoc perlre

Paul Lalli
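
Since the goal is to call this from a larger script, the three steps above could be wrapped in a sub along these lines. This is a sketch under assumptions, not a tested drop-in: the sub name copy_error_lines, its arguments and the returned count are all invented here, and /^Error/ stands in for whatever the real condition turns out to be.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper so a larger script can call the filter as one
# step. The sub name, the arguments and the return value are
# assumptions made for this sketch.
sub copy_error_lines {
    my ($in_path, $out_path) = @_;

    open my $in,  '<', $in_path  or die "Cannot open $in_path: $!";
    open my $out, '>', $out_path or die "Cannot open $out_path: $!";

    my $count = 0;
    while (my $line = <$in>) {
        next unless $line =~ /^Error/;   # replace with the real condition
        print {$out} $line;
        $count++;
    }

    close $in;
    close $out or die "Cannot close $out_path: $!";
    return $count;    # number of lines written
}

# Typical call from the surrounding script (file names are examples):
# my $written = copy_error_lines('input.txt', 'output.txt');
```

Passing the paths in as arguments keeps the file names out of the sub, so the rest of the script decides which log to scan and where the result goes.
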
 
