A little Direction Please

Andy

Greets :)


Q: I am trying to learn how to define some variables.

The basis of this script is to scrub log files for FTP logins and
separate out the successful logins,

then create an array (I hope that is the right terminology) to separate
the fields.

I hardcoded the log file for now, because I am still looking for a way
for it to scrub *.log files on a server,

but... hey, step by step, right?

Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-bytes cs-host
2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -

The trailing fields "226 0 -" mark a successful login.

My plan is to scrub the logs and export the matches to a file,

then sort the fields into variables.

I hope in the end to get:

1. a log of successful logins
2. a log of the last successful login (I think I am going to try a date
comparison to find the most recent one)
3. the ability to parse the fields and get data out of them.


I know that there are those of you who are advanced; I would
appreciate any directions or help.

Again, I am trying to put this together. This is what I have so far.

#!/usr/bin/perl
use strict;
use warnings;

open(INPUT, '<', "ex080120.log")or die("Could not open log file.");
open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");
my $extractedLine;
while (<INPUT>) {
    my $line = $_;
    if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {
        print OUTPUT "$1\n";
    }
}
close(INPUT);
close(OUTPUT);
exit;
 
Ben Morrow

Andy said:
Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-bytes cs-host
2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -

What are these fields separated by? A single space? Can the fields ever
contain spaces? How are they quoted in that case? What about newlines?

open(INPUT, '<', "ex080120.log")or die("Could not open log file.");
open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");

3-arg open: good.
Checking the return value: good.
It's better to keep filehandles in variables than use the old-fashioned
global handles, though; and if the open fails you should say what
failed, and why:

open(my $INPUT, '<', "ex080120.log")
    or die("can't read ex080120.log: $!");
open(my $OUTPUT, '>', "ftpacct.log")
    or die("can't write ftpacct.log: $!");

my $extractedLine;
while (<INPUT>) {
my $line = $_;

This is silly. If you want the line in $line, put it there in the first
place:

while (my $line = <$INPUT>) {

if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {
print OUTPUT "$1\n";
}

I would recommend splitting the line into a hash first, and then
selecting lines based on that. Something like

my @fields = qw/
    date time c_ip
    cs_username cs_method cs_uri_stem
    sc_status sc_bytes cs_host
/;

while (my $line = <$INPUT>) {

    # Here I assume fields are delimited by a single space, and
    # spaces and newlines *never* appear in a field (not even inside
    # quotes). If this isn't true, you probably want to use the
    # Text::CSV_XS module, which can parse all sorts of
    # <foo>-delimited files.

    my %record;
    @record{@fields} = split / /, $line;

    $record{sc_status} == 226
        and $record{sc_bytes} == 0
        and $record{cs_host} eq '-'
        or next;

    print $OUTPUT $line;
}

Once you've understood that bit of code it should be straightforward to
change it to do something more sophisticated. To keep track of the last
login for any given user, you need a hash %lastlogin, keyed by username,
that lives outside the loop.
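
A minimal sketch of that %lastlogin idea, not part of the original reply. It assumes the nine space-separated fields from the Fields: header, that the log is in date order, and that the wrapped sample record is really a single line. The sub name last_logins and the sample records are made up for illustration.

```perl
use strict;
use warnings;

my @fields = qw/
    date time c_ip
    cs_username cs_method cs_uri_stem
    sc_status sc_bytes cs_host
/;

# Hypothetical helper: read records from a filehandle and return a hash
# mapping each username to the "date time" of its last successful login.
sub last_logins {
    my ($fh) = @_;
    my %lastlogin;                      # lives outside the loop

    while (my $line = <$fh>) {
        chomp $line;                    # drop the newline so cs_host can be '-'
        my %record;
        @record{@fields} = split / /, $line;

        # Skip short or malformed lines, then anything that is not "226 0 -".
        next unless defined $record{cs_host};
        next unless $record{sc_status} eq '226'
                and $record{sc_bytes}  eq '0'
                and $record{cs_host}   eq '-';

        # Later records overwrite earlier ones, so the value left at the
        # end is the most recent login (assuming the log is in date order).
        $lastlogin{ $record{cs_username} } = "$record{date} $record{time}";
    }
    return %lastlogin;
}

# Made-up sample data, read through an in-memory filehandle.
my $sample = <<'END';
2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -
2008-01-20 00:05:10 x.x.x.x 0601_Beth [6952042]sent /0601_Beth/a.zip 550 0 -
2008-01-20 09:30:00 x.x.x.x 0598_Andy [6952050]sent /0598_Andy/b.zip 226 0 -
END
open(my $fh, '<', \$sample) or die("can't read sample data: $!");
my %last = last_logins($fh);
close($fh);

print "$_ last logged in at $last{$_}\n" for sort keys %last;
```

If the logs are not guaranteed to be in date order, the assignment would need a comparison against the stored value first; the ISO-style "date time" strings compare correctly with the string operator gt.
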
}
close(INPUT);
close(OUTPUT);

An advantage of keeping filehandles in variables is that they are closed
for you when the variable goes out of scope. An advantage of real
operating systems (Win32 counts, here) is that they close filehandles
for you when the process exits, in any case.

That said, there is value in explicitly closing a filehandle opened for
writing, *and checking the return value*. If any of the writes to that
filehandle failed (disk full, for instance) the error will be returned
by close. (Of course, if you want to catch errors sooner than that, you
can check the return value of print instead.)
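
A short sketch of checking both print and close, not part of the original reply. The temporary directory and file name are assumptions made only so the example can run anywhere; in the real script the target would be ftpacct.log.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical output location, used only for this sketch.
my $dir     = tempdir(CLEANUP => 1);
my $outfile = "$dir/ftpacct.log";

open(my $OUTPUT, '>', $outfile)
    or die("can't write $outfile: $!");

# print returns true on success, so a failed write can be caught here...
print {$OUTPUT} "2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -\n"
    or die("can't print to $outfile: $!");

# ...but output is buffered, so some errors (a full disk, say) may only
# show up when the buffer is flushed at close time.
close($OUTPUT)
    or die("can't close $outfile: $!");
```
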

There's no need to explicitly exit from a Perl program. Falling off the
end is the usual way to finish.

Ben
 
Jürgen Exner

Andy said:
Q: I am trying to learn how to define some variables

To define a variable in Perl typically you use the assignment operator
'='.
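
As a small illustration (a sketch, not from the original reply): under use strict each variable is declared with my, and the sigil says whether it is a scalar, an array, or a hash. The sample line assumes the wrapped record in the question is really one line with nine fields.

```perl
use strict;
use warnings;

# A scalar holds a single value, such as one line of the log.
my $line = '2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -';

# An array holds an ordered list, such as the fields of that line.
my @fields = split ' ', $line;

# A hash maps names to values, such as a few named fields.
my %record = (
    date        => $fields[0],
    cs_username => $fields[3],
    sc_status   => $fields[6],
);

print "user $record{cs_username} got status $record{sc_status} on $record{date}\n";
```
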

open(INPUT, '<', "ex080120.log")or die("Could not open log file.");
open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");

You might want to add the reason why the open() call failed and the file
name for which it failed.

my $extractedLine;

Why declare a variable that you never use again?

while (<INPUT>) {
my $line = $_;
if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {

I know for some people it is difficult to just trust the default
argument. But I would write this as
while (<INPUT>) {
if (m/^(.+226\s+0\s+-\s+.*)$/) {
or
print OUTPUT "$1\n";
}
}
close(INPUT);
close(OUTPUT);

You may want to check the success of the close() call, too, in
particular for a file handle you wrote to.

jue
 
RedGrittyBrick

Andy said:
My plan is to scrub the logs, export to file.

sort fields into variable.

perldoc -f split

I hope in the end to get

1. log of successful logins

grep "226 0 - *$" ex*.log > ftpacct.log

perl -n -e 'print if /226 0 - *$/' ex*.log > ftpacct.log

2. log of last successful login (I think I am going to try date
comparison from most recent to last.)

Logfiles are generally in date order, you just need the last record.

tail -n 1 successful-logins.log > last-successful-login.log

3. be able to parse the fields and get data.


#!/usr/bin/perl
use strict;
use warnings;
Good!


open(INPUT, '<', "ex080120.log")or die("Could not open log file.");

Best practice is to ...
- Use lexical filehandles
- Include filename in message
- Include the failure reason in the message

my $filename = 'ex080120.log';
open(my $input, '<', $filename)
    or die("Could not open '$filename' because $!");

open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");

see above
my $extractedLine;

Not used? Remove it.
while (<INPUT>) {
my $line = $_;

It's sometimes easier to work with $_ than assign it to another
variable. It would simplify your later code.
if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {

Matching ^.+ is wasteful.
You don't need to capture the whole line using ().
print OUTPUT "$1\n";

Unless you chomp your input you'll output an extra blank line.

Putting all the above together

if (/226\s+0\s+-\s*$/) {
    print OUTPUT;

OR

print OUTPUT if /226\s+0\s+-\s*$/;

Though I'd use lexical filehandles, as I wrote earlier. A lexical
filehandle cannot be used alone to print $_, so $_ has to be named:

print $output $_ if /226\s+0\s+-\s*$/;

However to achieve your other aim, use your original construction and add
$last_login = $line;
my ($date, $time, ... $hyphen) = split;
...

}
}
close(INPUT);
close(OUTPUT);

print "last successful login is $last_login";


Untested, caveat emptor.
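
A self-contained sketch of how those pieces might fit together, under stated assumptions: the records below are made up and read from an in-memory filehandle instead of ex080120.log, the variable names follow the Fields: header, and the log is assumed to be in date order.

```perl
use strict;
use warnings;

# Made-up records standing in for the contents of ex080120.log.
my $log = <<'END';
2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -
2008-01-20 00:03:19 x.x.x.x 0598_Andy [6952042]sent /0598_Andy/qff0599.zip 550 0 -
2008-01-20 00:07:45 x.x.x.x 0598_Andy [6952043]sent /0598_Andy/qff0600.zip 226 0 -
END

open(my $input, '<', \$log)
    or die("Could not open sample data because $!");

my $last_login = '';
my @successes;
while (my $line = <$input>) {
    next unless $line =~ /226\s+0\s+-\s*$/;   # successful records only

    $last_login = $line;                      # the last match wins

    # split ' ' splits on runs of whitespace and ignores leading and
    # trailing whitespace, so the newline never reaches $cs_host.
    my ($date, $time, $c_ip, $cs_username, $cs_method,
        $cs_uri_stem, $sc_status, $sc_bytes, $cs_host) = split ' ', $line;

    push @successes, "$date $time $cs_username";
}
close($input);

print "$_\n" for @successes;
print "last successful login is $last_login";
```
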
 
Andy

WOW!

Guys, you opened my eyes... I knew there were many ways to do this;
it is just confusing figuring out which one to use.
I have of course googled for file manipulation and sorting; I guess
it just takes experience to figure out which is best.

Thanks for the responses. All I have to do now is figure out how to take
what you have advised and get it to work.

I think I can safely say "progress in motion"... umm, slowly. :)

I will try your suggestions and see what happens...

-Thank you again

GREATLY APPRECIATED :)
 
Jürgen Exner

RedGrittyBrick said:
Matching ^.+ is wasteful.
You don't need to capture the whole line using ().


Unless you chomp your input you'll output an extra blank line.

My first thought, too. However because of the rather 'interesting' way
he is printing the captured group instead of just the plain line he is
losing the newline in the pattern match. Therefore he has to add it
back explicitly.

print OUTPUT if /226\s+0\s+-\s*$/;

Much nicer, of course.

jue
 
John W. Krahn

Ben said:
I would recommend splitting the line into a hash first, and then
selecting lines based on that. Something like

my @fields = qw/
    date time c_ip
    cs_username cs_method cs_uri_stem
    sc_status sc_bytes cs_host
/;

while (my $line = <$INPUT>) {

    # Here I assume fields are delimited by a single space, and
    # spaces and newlines *never* appear in a field (not even inside
    # quotes). If this isn't true, you probably want to use the
    # Text::CSV_XS module, which can parse all sorts of
    # <foo>-delimited files.

    my %record;
    @record{@fields} = split / /, $line;

    $record{sc_status} == 226
        and $record{sc_bytes} == 0
        and $record{cs_host} eq '-'

Because you are using "split / /, $line" $record{cs_host} will probably
contain "-\n" instead of '-'.

        or next;

    print $OUTPUT $line;
}
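
A small sketch of the problem and two possible ways around it (chomp first, or split on the special string ' '). The sample line is the record from the question, assumed to be a single line ending in a newline.

```perl
use strict;
use warnings;

my $line = "2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -\n";

# Splitting on a literal space leaves the newline attached to the last field.
my @raw = split / /, $line;

# Option 1: chomp a copy of the line before splitting.
chomp(my $copy = $line);
my @chomped = split / /, $copy;

# Option 2: split on the special string ' ', which splits on runs of
# whitespace and discards leading and trailing whitespace.
my @awk = split ' ', $line;

print "split / / alone: ", ($raw[-1]     eq '-' ? "last field is '-'" : "last field is not '-'"), "\n";
print "chomp first:     ", ($chomped[-1] eq '-' ? "last field is '-'" : "last field is not '-'"), "\n";
print "split ' ':       ", ($awk[-1]     eq '-' ? "last field is '-'" : "last field is not '-'"), "\n";
```
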


John
 
John W. Krahn

Jürgen Exner said:
My first thought, too. However because of the rather 'interesting' way
he is printing the captured group instead of just the plain line he is
losing the newline in the pattern match. Therefore he has to add it
back explicitly.

The \s+ at the end is greedy and will match everything at the end
including the newline unless there is a non-whitespace character after
it that .* will match.
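
A quick way to check this behaviour, as a sketch: the pattern is the one from the original script, and the sample line is the record from the question, assumed to be a single line with nothing but a newline after the hyphen.

```perl
use strict;
use warnings;

my $re = qr/^(.+226\s+0\s+-\s+.*)$/;

# With nothing after the hyphen, the final \s+ has to match the newline,
# so the newline ends up inside the captured group.
my $line = "2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -\n";
my ($captured) = $line =~ $re;
print "capture ends with a newline\n" if $captured =~ /\n\z/;

# A chomped line has no whitespace left after the hyphen, so the
# pattern does not match it at all.
chomp(my $chomped = $line);
print "chomped line does not match\n" if $chomped !~ $re;
```

So with this pattern, printing "$1\n" writes the captured newline plus one more, and chomping the input first would stop the line from matching.
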


John
 
Jürgen Exner

John W. Krahn said:
The \s+ at the end is greedy and will match everything at the end
including the newline unless there is a non-whitespace character after
it that .* will match.

You are right. I was looking at the trailing .* only and didn't dissect
the RE beyond that.
This RE certainly has some interesting side effects.

jue
 
Ben Morrow

John W. Krahn said:
Because you are using "split / /, $line" $record{cs_host} will probably
contain "-\n" instead of '-'.

Good point. I'm too used to -l :)

Ben
 
Tad J McClellan

Andy said:
Subject: A little Direction Please


Please put the subject of your article in the Subject of your article.

while (<INPUT>) {
my $line = $_;


If you want the line in $line, then put it in $line rather than
put it somewhere else, only to then copy it to $line:

while ( my $line = <INPUT> ) {
 
