A little Direction Please

Discussion in 'Perl Misc' started by Andy, May 13, 2008.

  1. Andy

    Andy Guest

    Greets :)


    Q: I am trying to learn how to define some variables.

    The basis of this script is to scrub log files for FTP logins and
    separate out the successful logins,

    then create an array (I hope that is the right terminology) to separate
    the fields.

    I hardcoded the log file name for now; eventually I am looking for a
    way for it to scrub *.log files on a server,

    but... hey, step by step, right?

    Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-bytes cs-host
    2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /
    0598_Andy/qff0598.zip 226 0 -

    The trailing fields "226 0 -" mark a successful login.

    My plan is to scrub the logs and export the matches to a file,

    then sort the fields into variables.

    I hope in the end to get:

    1. a log of successful logins
    2. a log of the last successful login (I think I am going to try date
    comparison, from most recent to oldest)
    3. the ability to parse the fields and get data.


    I know that some of you are advanced; I would appreciate any
    directions or help.

    Again, I am still trying to put this together. This is what I have so far.

    #!/usr/bin/perl
    use strict;
    use warnings;

    open(INPUT, '<', "ex080120.log")or die("Could not open log file.");
    open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");
    my $extractedLine;
    while (<INPUT>) {
    my $line = $_;
    if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {
    print OUTPUT "$1\n";
    }
    }
    close(INPUT);
    close(OUTPUT);
    exit;
    Andy, May 13, 2008
    #1

  2. Ben Morrow

    Ben Morrow Guest

    Quoth Andy <>:
    > Greets :)
    >
    > Q; I am trying to learn how to define some variables
    >
    > The basis of this script is to Scrub log files for ftp logins,
    > seperate the successful logins
    >
    > Then create an array ( I hope the right terminology) to seperate it
    >
    > I hardcoded the log file, because I am looking for a way for it to
    > scrub *.logs on a server
    >
    > but ...hey step by step right.
    >
    > Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-
    > bytes cs-host
    > 2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /
    > 0598_Andy/qff0598.zip 226 0 -


    What are these fields separated by? A single space? Can the fields ever
    contain spaces? How are they quoted in that case? What about newlines?

    > This field 226 0 - is a successful login
    >
    > My plan is to scrub the logs, export to file.
    >
    > sort fields into variable.
    >
    > I hope in the end to get
    >
    > 1..log of successful logins
    > 2.log of last successful login ( I think I am going to try date
    > comparison from most recent to last.)
    > 3 be able to parse the fields and get data.
    >
    >
    > I know that there are those of you who are advanced, I would
    > appreciate any directions or help.
    >
    > Again I am trying to put this together this is what I have so far.
    >
    > #!/usr/bin/perl
    > use strict;
    > use warnings;
    >
    > open(INPUT, '<', "ex080120.log")or die("Could not open log file.");
    > open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");


    3-arg open: good.
    Checking the return value: good.
    It's better to keep filehandles in variables than use the old-fashioned
    global handles, though; and if the open fails you should say what
    failed, and why:

    open(my $INPUT, '<', "ex080120.log")
    or die("can't read ex080120.log: $!");
    open(my $OUTPUT, '>', "ftpacct.log")
    or die("can't write ftpacct.log: $!");

    > my $extractedLine;
    > while (<INPUT>) {
    > my $line = $_;


    This is silly. If you want the line in $line, put it there in the first
    place:

    while (my $line = <$INPUT>) {

    > if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {
    > print OUTPUT "$1\n";
    > }


    I would recommend splitting the line into a hash first, and then
    selecting lines based on that. Something like

    my @fields = qw/
    date time c_ip
    cs_username cs_method cs_uri_stem
    sc_status sc_bytes cs_host
    /;

    while (my $line = <$INPUT>) {

    # Here I assume fields are delimited by a single space, and
    # spaces and newlines *never* appear in a field (not even inside
    # quotes). If this isn't true, you probably want to use the
    # Text::CSV_XS module, which can parse all sorts of
    # <foo>-delimited files.

    my %record;
    @record{@fields} = split / /, $line;

    $record{sc_status} == 226
    and $record{sc_bytes} == 0
    and $record{cs_host} eq '-'
    or next;

    print $OUTPUT $line;
    }

    Once you've understood that bit of code it should be straightforward to
    change it to do something more sophisticated. To keep track of the last
    login for any given user, you need a hash %lastlogin, keyed by username,
    that lives outside the loop.
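
    A minimal sketch of that %lastlogin idea, assuming whitespace-separated
    fields in the order given by the Fields: header. The sub name
    last_logins and the sample records are made up for illustration, and
    timestamps are compared as strings, which only works because the
    YYYY-MM-DD HH:MM:SS layout sorts lexically.

```perl
use strict;
use warnings;

# Field names follow the "Fields:" header, with '-' replaced by '_'.
my @fields = qw/
    date time c_ip
    cs_username cs_method cs_uri_stem
    sc_status sc_bytes cs_host
/;

# Hypothetical helper: read records from a filehandle and return a hash
# mapping each username to the "date time" of its latest matching record.
sub last_logins {
    my ($fh) = @_;
    my %lastlogin;    # lives outside the loop, so it survives every iteration

    while (my $line = <$fh>) {
        my %record;
        # split ' ' splits on runs of whitespace and drops the newline.
        @record{@fields} = split ' ', $line;

        # Skip short records and header lines whose fields are not numeric.
        next unless defined $record{cs_host};
        next unless $record{sc_status} =~ /^\d+$/
                and $record{sc_bytes}  =~ /^\d+$/;

        next unless $record{sc_status} == 226
                and $record{sc_bytes}  == 0
                and $record{cs_host}   eq '-';

        my $user  = $record{cs_username};
        my $stamp = "$record{date} $record{time}";

        # Keep the newest timestamp seen so far for this user.
        $lastlogin{$user} = $stamp
            if !exists $lastlogin{$user} or $stamp gt $lastlogin{$user};
    }
    return %lastlogin;
}

# Demo on an in-memory sample (these records are invented).
my $sample = join '',
    "2008-01-20 00:00:02 x.x.x.x 0598_Andy [1]sent /0598_Andy/a.zip 226 0 -\n",
    "2008-01-20 08:15:30 x.x.x.x 0598_Andy [2]sent /0598_Andy/b.zip 226 0 -\n",
    "2008-01-20 09:00:00 x.x.x.x 0600_Bob [3]sent /0600_Bob/c.zip 550 0 -\n";

open(my $fh, '<', \$sample) or die "can't open in-memory sample: $!";
my %last = last_logins($fh);
print "$_ last logged in at $last{$_}\n" for sort keys %last;
```

    To run it against a real log, open the file with a lexical handle as
    shown earlier and pass that handle to last_logins instead.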

    > }
    > close(INPUT);
    > close(OUTPUT);


    An advantage of keeping filehandles in variables is that they are closed
    for you when the variable goes out of scope. An advantage of real
    operating systems (Win32 counts, here) is that they close filehandles
    for you when the process exits, in any case.

    That said, there is value in explicitly closing a filehandle opened for
    writing, *and checking the return value*. If any of the writes to that
    filehandle failed (disk full, for instance) the error will be returned
    by close. (Of course, if you want to catch errors sooner than that, you
    can check the return value of print instead.)
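
    A small sketch of checking both print and close on an output handle.
    The helper name write_lines is hypothetical, and the demo writes to a
    temporary file rather than ftpacct.log.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write @lines to $path, checking every step.
sub write_lines {
    my ($path, @lines) = @_;

    open(my $out, '>', $path)
        or die "can't write $path: $!";

    for my $line (@lines) {
        # The braces make it explicit that $out is the filehandle.
        print {$out} $line
            or die "write to $path failed: $!";
    }

    # Output is buffered, so an error such as a full disk may only be
    # reported when the buffer is flushed here.
    close($out)
        or die "can't close $path: $!";

    return scalar @lines;
}

# Demo: write two lines to a temporary file that is removed at exit.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh) or die "can't close temp file: $!";
my $count = write_lines($path, "first\n", "second\n");
print "wrote $count lines to $path\n";
```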

    > exit;


    There's no need to explicitly exit from a Perl program. Falling off the
    end is the usual way to finish.

    Ben

    --
    I've seen things you people wouldn't believe: attack ships on fire off
    the shoulder of Orion; I watched C-beams glitter in the dark near the
    Tannhauser Gate. All these moments will be lost, in time, like tears in rain.
    Time to die.
    Ben Morrow, May 13, 2008
    #2

  3. Andy <> wrote:
    >Q; I am trying to learn how to define some variables


    To define a variable in Perl typically you use the assignment operator
    '='.
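
    For illustration, a sketch of the three basic variable types under
    "use strict", where each variable is declared with "my" and given a
    value with "=". The sub name demo_variables and the values are made up.

```perl
use strict;
use warnings;

# Hypothetical demo: declare one variable of each kind and summarise them.
sub demo_variables {
    my $status = 226;                           # scalar: a single value
    my @fields = qw(date time c-ip);            # array: an ordered list
    my %record = (status => 226, bytes => 0);   # hash: key/value pairs

    # An array in scalar context yields its element count.
    return "status=$status fields=" . scalar(@fields) . " bytes=$record{bytes}";
}

print demo_variables(), "\n";
```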

    >The basis of this script is to Scrub log files for ftp logins,
    >seperate the successful logins
    >
    >Then create an array ( I hope the right terminology) to seperate it
    >
    >I hardcoded the log file, because I am looking for a way for it to
    >scrub *.logs on a server
    >
    >but ...hey step by step right.
    >
    >Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-
    >bytes cs-host
    > 2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /
    >0598_Andy/qff0598.zip 226 0 -
    >
    >This field 226 0 - is a successful login
    >
    >My plan is to scrub the logs, export to file.
    >
    >sort fields into variable.
    >
    >I hope in the end to get
    >
    >1..log of successful logins
    >2.log of last successful login ( I think I am going to try date
    >comparison from most recent to last.)
    >3 be able to parse the fields and get data.
    >
    >
    >I know that there are those of you who are advanced, I would
    >appreciate any directions or help.
    >
    >Again I am trying to put this together this is what I have so far.
    >
    >#!/usr/bin/perl
    >use strict;
    >use warnings;
    >
    >open(INPUT, '<', "ex080120.log")or die("Could not open log file.");
    >open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");


    You might want to add the reason why the open() call failed and the file
    name for which it failed.

    >my $extractedLine;


    Why declare a variable that you never use again?

    >while (<INPUT>) {
    > my $line = $_;
    > if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {


    I know for some people it is difficult to just trust the default
    argument. But I would write this as
    while (<INPUT>) {
    if (m/^(.+226\s+0\s+-\s+.*)$/) {
    or
    while (my $line=<INPUT>) {
    if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {

    > print OUTPUT "$1\n";
    > }
    >}
    >close(INPUT);
    >close(OUTPUT);


    You may want to check the success of the close() call, too, in
    particular for a file handle you wrote to.

    jue
    Jürgen Exner, May 13, 2008
    #3
  4. Andy wrote:
    > Greets :)
    >
    >
    > Q; I am trying to learn how to define some variables
    >
    > The basis of this script is to Scrub log files for ftp logins,
    > seperate the successful logins
    >
    > Then create an array ( I hope the right terminology) to seperate it
    >
    > I hardcoded the log file, because I am looking for a way for it to
    > scrub *.logs on a server
    >
    > but ...hey step by step right.
    >
    > Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-
    > bytes cs-host
    > 2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /
    > 0598_Andy/qff0598.zip 226 0 -
    >
    > This field 226 0 - is a successful login
    >
    > My plan is to scrub the logs, export to file.
    >
    > sort fields into variable.


    perldoc -f split
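
    A sketch of what split does with the sample record, assuming the
    fields are separated by whitespace and that the wrapped sample is one
    record whose cs-uri-stem is /0598_Andy/qff0598.zip (the line wrap in
    the post makes that a guess). The names parse_record and @names are
    hypothetical.

```perl
use strict;
use warnings;

# Field names from the "Fields:" header, with '-' replaced by '_'.
my @names = qw/
    date time c_ip
    cs_username cs_method cs_uri_stem
    sc_status sc_bytes cs_host
/;

# Hypothetical helper: turn one log line into a hash of named fields.
sub parse_record {
    my ($line) = @_;
    my %record;
    # split with the string ' ' is the special awk-like mode: it skips
    # leading whitespace, splits on runs of whitespace, and discards the
    # trailing newline along with any other trailing whitespace.
    @record{@names} = split ' ', $line;
    return %record;
}

# Sample record from the post, rejoined onto one line (assumed layout).
my %r = parse_record(
    "2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -\n"
);
print "user=$r{cs_username} status=$r{sc_status} bytes=$r{sc_bytes} host=$r{cs_host}\n";
```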


    >
    > I hope in the end to get
    >
    > 1..log of successful logins


    grep "226 0 - *$" ex*.log > ftpacct.log

    perl -n -e 'print if /226 0 - *$/' ex*.log > ftpacct.log

    > 2.log of last successful login ( I think I am going to try date
    > comparison from most recent to last.)


    Logfiles are generally in date order, you just need the last record.

    tail -n 1 successful-logins.log > last-successful-login.log

    > 3 be able to parse the fields and get data.
    >
    >
    > I know that there are those of you who are advanced, I would
    > appreciate any directions or help.
    >
    > Again I am trying to put this together this is what I have so far.
    >
    > #!/usr/bin/perl
    > use strict;
    > use warnings;


    Good!

    >
    > open(INPUT, '<', "ex080120.log")or die("Could not open log file.");


    Best practice is to ...
    - Use lexical filehandles
    - Include filename in message
    - Include the failure reason in the message

    my $filename = 'ex080120.log';
    open(my $input, '<', $filename)
    or die("Could not open '$filename' because $!");

    > open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");


    see above

    > my $extractedLine;


    Not used? Remove it.

    > while (<INPUT>) {
    > my $line = $_;


    It's sometimes easier to work with $_ than assign it to another
    variable. It would simplify your later code.

    > if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {


    Matching ^.+ is wasteful.
    You don't need to capture the whole line using ().

    > print OUTPUT "$1\n";


    Unless you chomp your input you'll output an extra blank line.

    Putting all the above together

    if (/226\s+0\s+-\s*$/) {
    print OUTPUT;
    }

    OR

    print OUTPUT if /226\s+0\s+-\s*$/;

    Though I'd use lexical filehandles, as I wrote earlier.

    print $output $_ if /226\s+0\s+-\s*$/;

    However to achieve your other aim, use your original construction and add
    $last_login = $line;
    my ($date, $time, ... $hyphen) = split;
    ...


    > }
    > }
    > close(INPUT);
    > close(OUTPUT);


    print "last successful login is $last_login";

    > exit;
    >



    Untested, caveat emptor.
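
    One way the fragments above might fit together, as a sketch only. The
    sub name scan_log is hypothetical, the second sample record is
    invented, and the demo uses in-memory handles so it runs without the
    real log. The leading \s in the pattern is an addition so that a value
    such as 1226 does not match.

```perl
use strict;
use warnings;

# Hypothetical helper: copy matching records from $in to $out and
# return the last matching line (undef if nothing matched).
sub scan_log {
    my ($in, $out) = @_;
    my $last_login;

    while (my $line = <$in>) {
        # Status 226, 0 bytes, '-' as the final field, optional trailing space.
        next unless $line =~ /\s226\s+0\s+-\s*$/;

        # The line still carries its newline, so none is added here.
        print {$out} $line or die "write failed: $!";
        $last_login = $line;    # later matches overwrite earlier ones
    }
    return $last_login;
}

# Demo on in-memory data; the second record is made up and must not match.
my $sample =
    "2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -\n"
  . "2008-01-20 00:05:10 x.x.x.x 0598_Andy [6952042]sent /0598_Andy/other.zip 550 0 -\n";
my $matched = '';

open(my $in,  '<', \$sample)  or die "can't read sample: $!";
open(my $out, '>', \$matched) or die "can't open output buffer: $!";
my $last_login = scan_log($in, $out);
close($out) or die "can't close output buffer: $!";

if (defined $last_login) {
    my ($date, $time, $c_ip, $user) = split ' ', $last_login;
    print "last successful login is $user at $date $time\n";
}
```

    With real files, open ex080120.log for reading and ftpacct.log for
    writing using lexical handles, then pass both handles to scan_log.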

    --
    RGB
    RedGrittyBrick, May 13, 2008
    #4
  5. Andy

    Andy Guest

    On May 13, 12:35 pm, RedGrittyBrick <>
    wrote:
    > Andy wrote:
    > > Greets :)

    >
    > > Q; I am trying to learn how to define some variables

    >
    > > The basis of this script is to Scrub log files for ftp logins,
    > > seperate the successful logins

    >
    > > Then create an array ( I hope the right terminology) to seperate it

    >
    > > I hardcoded the log file, because I am looking for a way for it to
    > > scrub *.logs on a server

    >
    > > but ...hey step by step right.

    >
    > > Fields: date time c-ip cs-username cs-method cs-uri-stem sc-status sc-
    > > bytes cs-host
    > > 2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /
    > > 0598_Andy/qff0598.zip 226 0 -

    >
    > > This field 226 0 - is a successful login

    >
    > > My plan is to scrub the logs, export to file.

    >
    > > sort fields into variable.

    >
    > perldoc -f split
    >
    >
    >
    > > I hope in the end to get

    >
    > > 1..log of successful logins

    >
    > grep "226 0 - *$" ex*.log > ftpacct.log
    >
    > perl -n -e 'print if /226 0 - *$/' ex*.log > ftpacct.log
    >
    > > 2.log of last successful login ( I think I am going to try date
    > > comparison from most recent to last.)

    >
    > Logfiles are generally in date order, you just need the last record.
    >
    > tail -n 1 successful-logins.log > last-successful-login.log
    >
    > > 3 be able to parse the fields and get data.

    >
    > > I know that there are those of you who are advanced, I would
    > > appreciate any directions or help.

    >
    > > Again I am trying to put this together this is what I have so far.

    >
    > > #!/usr/bin/perl
    > > use strict;
    > > use warnings;

    >
    > Good!
    >
    >
    >
    > > open(INPUT, '<', "ex080120.log")or die("Could not open log file.");

    >
    > Best practice is to ...
    > - Use lexical filehandles
    > - Include filename in message
    > - Include the failure reason in the message
    >
    > my $filename = 'ex080120.log';
    > open(my $input, '<', $filename)
    > or die("Could not open '$filename' because $!");
    >
    > > open(OUTPUT, '>',"ftpacct.log")or die("Could not open log file.");

    >
    > see above
    >
    > > my $extractedLine;

    >
    > Not used? Remove it.
    >
    > > while (<INPUT>) {
    > > my $line = $_;

    >
    > It's sometimes easier to work with $_ than assign it to another
    > variable. It would simplify your later code.
    >
    > > if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {

    >
    > Matching ^.+ is wasteful.
    > You don't need to capture the whole line using ().
    >
    > > print OUTPUT "$1\n";

    >
    > Unless you chomp your input you'll output an extra blank line.
    >
    > Putting all the above together
    >
    > if (/226\s+0\s+-\s*$/) {
    > print OUTPUT;
    > }
    >
    > OR
    >
    > print OUTPUT if /226\s+0\s+-\s*$/;
    >
    > Though I'd use lexical filehandles, as I wrote earlier.
    >
    > print $output $_ if /226\s+0\s+-\s*$/;
    >
    > However to achieve your other aim, use your original construction and add
    > $last_login = $line;
    > my ($date, $time, ... $hyphen) = split;
    > ...
    >
    > > }
    > > }
    > > close(INPUT);
    > > close(OUTPUT);

    >
    > print "last successful login is $last_login";
    >
    > > exit;

    >
    > Untested, caveat emptor.
    >
    > --
    > RGB


    WOW!

    Guys, you opened my eyes... I knew there were many ways to do this;
    it is just confusing figuring out which one to use.
    I have of course googled for file manipulation and sorting; I guess
    it just takes experience to figure out which is best.

    Thanks for the responses. All I have to do now is take what you have
    advised and try to get it to work.

    I think I can safely say "progress in motion"... umm, slowly. :)

    I will try your suggestions and see what happens.

    Thank you again.

    GREATLY APPRECIATED :)
    Andy, May 13, 2008
    #5
  6. RedGrittyBrick <> wrote:
    >Andy wrote:
    >> if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {

    >
    >Matching ^.+ is wasteful.
    >You don't need to capture the whole line using ().
    >
    >> print OUTPUT "$1\n";

    >
    >Unless you chomp your input you'll output an extra blank line.


    My first thought, too. However, because of the rather 'interesting' way
    he is printing the captured group instead of just the plain line, he is
    losing the newline in the pattern match. Therefore he has to add it
    back explicitly.

    > print OUTPUT if /226\s+0\s+-\s*$/;


    Much nicer, of course.

    jue
    Jürgen Exner, May 13, 2008
    #6
  7. Ben Morrow wrote:
    > Quoth Andy <>:
    >>
    >> if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {
    >> print OUTPUT "$1\n";
    >> }

    >
    > I would recommend splitting the line into a hash first, and then
    > selecting lines based on that. Something like
    >
    > my @fields = qw/
    > date time c_ip
    > cs_username cs_method cs_uri_stem
    > sc_status sc_bytes cs_host
    > /;
    >
    > while (my $line = <$INPUT>) {
    >
    > # Here I assume fields are delimited by a single space, and
    > # spaces and newlines *never* appear in a field (not even inside
    > # quotes). If this isn't true, you probably want to use the
    > # Text::CSV_XS module, which can parse all sorts of
    > # <foo>-delimited files.
    >
    > my %record;
    > @record{@fields} = split / /, $line;
    >
    > $record{sc_status} == 226
    > and $record{sc_bytes} == 0
    > and $record{cs_host} eq '-'


    Because you are using "split / /, $line" $record{cs_host} will probably
    contain "-\n" instead of '-'.


    > or next;
    >
    > print $OUTPUT $line;
    > }
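
    A sketch of the problem and one possible fix: calling chomp before
    splitting on a single space. The sub names parse_raw and parse_chomped
    are hypothetical, and the sample line assumes the wrapped record from
    the original post is a single line.

```perl
use strict;
use warnings;

my @fields = qw/
    date time c_ip
    cs_username cs_method cs_uri_stem
    sc_status sc_bytes cs_host
/;

# As posted: the newline stays attached to the last field.
sub parse_raw {
    my ($line) = @_;
    my %record;
    @record{@fields} = split / /, $line;
    return %record;
}

# Possible fix: remove the line ending before splitting.
sub parse_chomped {
    my ($line) = @_;
    chomp $line;
    my %record;
    @record{@fields} = split / /, $line;
    return %record;
}

my $line = "2008-01-20 00:00:02 x.x.x.x 0598_Andy [6952041]sent /0598_Andy/qff0598.zip 226 0 -\n";
my %raw     = parse_raw($line);
my %chomped = parse_chomped($line);

print "raw cs_host eq '-': ",     ($raw{cs_host}     eq '-' ? "yes" : "no"), "\n";
print "chomped cs_host eq '-': ", ($chomped{cs_host} eq '-' ? "yes" : "no"), "\n";
```

    Splitting with the string ' ' instead of the pattern / / would also
    drop the newline, at the cost of treating runs of whitespace as one
    separator.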



    John
    --
    Perl isn't a toolbox, but a small machine shop where you
    can special-order certain sorts of tools at low cost and
    in short order. -- Larry Wall
    John W. Krahn, May 13, 2008
    #7
  8. Jürgen Exner wrote:
    > RedGrittyBrick <> wrote:
    >> Andy wrote:
    >>> if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {

    >> Matching ^.+ is wasteful.
    >> You don't need to capture the whole line using ().
    >>
    >>> print OUTPUT "$1\n";

    >> Unless you chomp your input you'll output an extra blank line.

    >
    > My first thought, too. However, because of the rather 'interesting' way
    > he is printing the captured group instead of just the plain line, he is
    > losing the newline in the pattern match. Therefore he has to add it
    > back explicitly.


    The \s+ at the end is greedy and will match everything at the end
    including the newline unless there is a non-whitespace character after
    it that .* will match.
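
    A short sketch that shows this behaviour with the pattern from the
    original script. The helper name captured and the two test lines are
    made up for illustration.

```perl
use strict;
use warnings;

# The pattern from the original script, unchanged.
my $re = qr/^(.+226\s+0\s+-\s+.*)$/;

# Hypothetical helper: return what $1 would hold, or undef on no match.
sub captured {
    my ($line) = @_;
    return $line =~ $re ? $1 : undef;
}

# First line: nothing but the newline follows the '-', so \s+ has to
# consume the newline and it ends up inside the capture.
# Second line: \s+ takes the space, .* takes "host", and $ matches just
# before the final newline, so the capture has no newline.
for my $line ("x 226 0 -\n", "x 226 0 - host\n") {
    my $got = captured($line);
    (my $shown = $line) =~ s/\n/\\n/;
    print "$shown => ",
        ($got =~ /\n\z/ ? "newline captured" : "newline not captured"), "\n";
}
```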


    John
    --
    Perl isn't a toolbox, but a small machine shop where you
    can special-order certain sorts of tools at low cost and
    in short order. -- Larry Wall
    John W. Krahn, May 13, 2008
    #8
  9. "John W. Krahn" <> wrote:
    >Jürgen Exner wrote:
    >> RedGrittyBrick <> wrote:
    >>> Andy wrote:
    >>>> if ($line =~ m/^(.+226\s+0\s+-\s+.*)$/) {
    >>> Matching ^.+ is wasteful.
    >>> You don't need to capture the whole line using ().
    >>>
    >>>> print OUTPUT "$1\n";
    >>> Unless you chomp your input you'll output an extra blank line.

    >>
    >> My first thought, too. However, because of the rather 'interesting' way
    >> he is printing the captured group instead of just the plain line, he is
    >> losing the newline in the pattern match. Therefore he has to add it
    >> back explicitly.

    >
    >The \s+ at the end is greedy and will match everything at the end
    >including the newline unless there is a non-whitespace character after
    >it that .* will match.


    You are right. I was looking at the trailing .* only and didn't dissect
    the RE beyond that.
    This RE certainly has some interesting side effects.

    jue
    Jürgen Exner, May 13, 2008
    #9
  10. Ben Morrow

    Ben Morrow Guest

    Quoth "John W. Krahn" <>:
    > Ben Morrow wrote:
    > >
    > > while (my $line = <$INPUT>) {

    <snip>
    > > my %record;
    > > @record{@fields} = split / /, $line;
    > >
    > > $record{sc_status} == 226
    > > and $record{sc_bytes} == 0
    > > and $record{cs_host} eq '-'

    >
    > Because you are using "split / /, $line" $record{cs_host} will probably
    > contain "-\n" instead of '-'.


    Good point. I'm too used to -l :)

    Ben

    --
    "Faith has you at a disadvantage, Buffy."
    "'Cause I'm not crazy, or 'cause I don't kill people?"
    "Both, actually."
    Ben Morrow, May 13, 2008
    #10
  11. Andy <> wrote:

    > Subject: A little Direction Please



    Please put the subject of your article in the Subject of your article.


    > while (<INPUT>) {
    > my $line = $_;



    If you want the line in $line, then put it in $line rather than
    put it somewhere else, only to then copy it to $line:

    while ( my $line = <INPUT> ) {


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
    Tad J McClellan, May 14, 2008
    #11
