need help with prog. logic

Discussion in 'Perl Misc' started by jlm33990, May 28, 2006.

  1. jlm33990

    jlm33990 Guest

    I'm trying to read a file which is sorted by field 3 which is an ip
    number with the port number added to the end. Each new record starts
    with the timestamp.
    Here's part of the file so you get the right idea...
    11:03:36.315447 IP 12.101.124.3.14459 > mandy2.me.com.smtp: S 2740947399:2740947399(0) win 65535 <mss 1380,nop,nop,sackOK>
    11:03:39.203578 IP 12.101.124.3.14459 > mandy2.me.com.smtp: S 2740947399:2740947399(0) win 65535 <mss 1380,nop,nop,sackOK>
    11:03:45.216961 IP 12.101.124.3.14459 > mandy2.me.com.smtp: S 2740947399:2740947399(0) win 65535 <mss 1380,nop,nop,sackOK>
    11:18:56.302252 IP 12.101.124.3.16527 > mandy2.me.com.smtp: S 222122110:222122110(0) win 65535 <mss 1380,nop,nop,sackOK>
    11:18:59.344184 IP 12.101.124.3.16527 > mandy2.me.com.smtp: S 222122110:222122110(0) win 65535 <mss 1380,nop,nop,sackOK>
    11:29:42.569311 IP 12.102.102.180.36150 > mandy2.me.com.smtp: S 416085218:416085218(0) win 65535 <mss 1380,nop,nop,sackOK>
    11:29:48.527397 IP 12.102.102.180.36150 > mandy2.me.com.smtp: S 416085218:416085218(0) win 65535 <mss 1380,nop,nop,sackOK>
    10:52:36.447595 IP 12.103.253.170.10434 > mandy2.me.com.smtp: . ack 1 win 64512
    10:53:25.046979 IP 12.103.253.170.10434 > mandy2.me.com.smtp: . ack 258 win 64256

    For each unique IP number (port not included) I want to print a
    summary showing the IP# and how many records there are for that number.
    Here's the code that I've been struggling with...

    #!/usr/bin/perl
    open(REPORT,">apost.report")|| die "cannot create report $!\n";
    print REPORT " apost.report\n";
    print REPORT "\n";
    print REPORT "IP# #packets\n";
    print REPORT "----------------------------------------------------------------------------\n";
    format REPORT=
    @<<<<<<<<<<<<<<<<<<<<<@<<<<<<<<<<<<<<<<<<<<<<<<<<<<<@<<<<<<<<<<<<<<<<<<<<<<<<
    $ip,$hits,$lastip
    .
    $lastip='12.101.124.3';
    $hits=0;
    $first="y";
    @ifile=`cat sorted`;
    foreach(@ifile) {
        ($time,$m,$ip1,$t) = split(/ /,$_);
        ($ip2,$ip3,$ip4,$ip5,$ip6) = split(/\./,$ip1);
        $ip="$ip2\.$ip3\.$ip4\.$ip5";
        if ($ip == $lastip) {
            $hits++;
        }
        else {
            write REPORT,"\n";
            $lastip=$ip;
            $hits=0;
        }
    }
    print REPORT "------------------------------------------------------------------------\n";
    close(REPORT);

    Needless to say - it doesn't work (but I'm close) - can anyone point
    out the flaw please?
    Thanks - jim
    jlm33990, May 28, 2006
    #1

  2. jlm33990

    Guest

    jlm33990 <> wrote:
    > I'm trying to read a file which is sorted by field 3 which is an ip
    > number with the port number added to the end. Each new record starts
    > with the timestamp.


    > For each unique ip number (with port not included) I want to print a
    > summary showing ip# and how many records for that number.
    > here's the code that I've been struggling with.......


    > #!/usr/bin/perl


    Please use:

    use warnings;
    use strict;

    Your problem is based on three things. The use of $ip and $lastip being
    one.

    > format REPORT=
    > @<<<<<<<<<<<<<<<<<<<<<@<<<<<<<<<<<<<<<<<<<<<<<<<<<<<@<<<<<<<<<<<<<<<<<<<<<<<<
    > $ip,$hits,$lastip


    ^^^^
    Note the variables used here.

    > .
    > $lastip='12.101.124.3';
    > $hits=0;
    > $first="y";
    > @ifile=`cat sorted`;
    > foreach(@ifile) {
    > ($time,$m,$ip1,$t) = split(/ /,$_);
    > ($ip2,$ip3,$ip4,$ip5,$ip6) = split(/\./,$ip1);
    > $ip="$ip2\.$ip3\.$ip4\.$ip5";


    ^^^^^
    Here you set $ip for each entry read.

    > if ($ip == $lastip) {
    > $hits++;
    > }
    > else {
    > write REPORT,"\n";


    ^^^^^
    You are trying to write $ip,$hits,$lastip. But $ip will be that of the
    last entry read, not the ip address you were counting hits for.

    So in the format, set the variables to: $lastip, $hits, $lastip

    > $lastip=$ip;
    > $hits=0;


    ^^^^^^^^
    This should be $hits = 1 since you have just read an entry with that ip.

    > }


    Here you do not write the REPORT for the last ip address you were
    counting hits for.
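
    The points above (count from 1 when a new address starts, and write
    out the last address after the loop ends) can be sketched on their
    own, separate from the report format. This is only an illustration:
    count_runs is a made-up helper name, and it assumes the addresses
    arrive already grouped, as they do in a sorted file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes IP strings that are already grouped (sorted)
# and returns a list of [ip, count] pairs, one per run of equal IPs.
sub count_runs {
    my @ips = @_;
    my @summary;
    my ($last, $hits);

    for my $ip (@ips) {
        if (defined $last and $ip eq $last) {
            $hits++;                        # same IP as the previous record
        }
        else {
            # close the previous run, if there was one
            push @summary, [$last, $hits] if defined $last;
            $last = $ip;
            $hits = 1;                      # the record just read counts too
        }
    }
    # flush the final run, which the loop itself never closes
    push @summary, [$last, $hits] if defined $last;
    return @summary;
}

my @runs = count_runs(qw(12.101.124.3 12.101.124.3 12.102.102.180));
print "$_->[0] $_->[1]\n" for @runs;
```

    Leaving $last undefined until the first record is read avoids having
    to hard-code the first address in the script.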

    > }
    > print REPORT


    > Needless to say - it doesn't work (but I'm close) - can anyone point
    > out the flaw please?


    A couple of other suggestions... (i) try holding the entries for each
    ip address in a hash and then printing them out once the whole file has
    been processed... this will allow an unsorted file to be processed.
    (ii) Instead of reading the file in using `cat sorted`, do a standard
    open and then loop through the file an entry at a time.

    Axel
    , May 28, 2006
    #2

  3. jlm33990

    MSG Guest

    jlm33990 wrote:

    > $ip="$ip2\.$ip3\.$ip4\.$ip5";

    No need to escape dots here.
    > if ($ip == $lastip) {

    Prog logic aside, the above line is probably your main problem:
    == compares numbers, but you are comparing two strings!
    That suggests you do not have 'use strict' and 'use warnings' enabled.
    You should always put them at the beginning of your Perl programs
    so that you can save yourself and other people some trouble.
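
    To see why the numeric comparison goes wrong, here is a small
    sketch. The helper names are made up for the illustration, and the
    second address is an invented one that shares its first two octets
    with an address from the sample data. When a string such as
    '12.101.124.3' is used as a number, Perl only uses the leading
    numeric part (12.101), so different addresses can compare equal.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers that wrap the two comparison operators.
sub same_as_number {
    my ($x, $y) = @_;
    no warnings 'numeric';    # silence "isn't numeric" warnings for the demo
    return $x == $y ? 1 : 0;
}

sub same_as_string {
    my ($x, $y) = @_;
    return $x eq $y ? 1 : 0;
}

# The second address is invented; it shares the first two octets.
my ($ip_a, $ip_b) = ('12.101.124.3', '12.101.99.9');
printf "==: %d  eq: %d\n",
    same_as_number($ip_a, $ip_b), same_as_string($ip_a, $ip_b);
```

    With warnings enabled and the 'no warnings' line removed, Perl
    reports that the argument is not numeric, which is how the mistake
    usually gets noticed.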
    MSG, May 28, 2006
    #3
  4. jlm33990

    jlm33990 Guest

    Thanks-
    Your points are well taken.
    However, if I use strict & warnings the program will not run at all
    .....
    Global symbol "$hits" requires explicit package name at ./apost line 27.
    Execution of ./apost aborted due to compilation errors.

    If I take out the use strict & warnings and implement your suggestions
    and MSG's, the program runs and works in that it summarizes
    the first few IPs properly but mysteriously stops after processing only
    the first 7 IP #'s. There are more IPs to be tallied but for some
    reason they are not processed. It appears that eof was detected
    improperly.
    How can that be? Any ideas?
    jlm33990, May 28, 2006
    #4
  5. jlm33990

    jlm33990 Guest

    Thanks - please see my reply to other helper....Jim
    jlm33990, May 28, 2006
    #5
  6. jlm33990

    Guest

    jlm33990 <> wrote:
    > Thanks-
    > Your points are well taken.
    > However, If I use strict & warnings the program will not run at all


    > Global symbol "$hits" requires explicit package name at ./apost line


    That is the purpose of strict... it forces variables to be declared,
    normally as lexical (i.e. local) variables (that is a *very* simplified
    explanation).

    So each variable needs to be declared with 'my' in the outermost block
    in which it is used. Normally this can be combined with the first
    assignment to that variable, but sometimes the programming logic
    demands a separate declaration such as:

    my $hits;

    placed before the loop or block that uses it.
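
    As a rough illustration of where the declaration goes, here is a
    sketch with a made-up helper, count_matching. The counter is
    declared before the loop because its value is needed after the loop
    ends, while the loop variable only needs to exist inside the loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: counts how many items equal $wanted.
sub count_matching {
    my ($wanted, @items) = @_;
    my $hits = 0;                 # declared outside the loop so it keeps its value
    for my $item (@items) {       # $item is scoped to the loop body only
        $hits++ if $item eq $wanted;
    }
    return $hits;                 # $hits is still visible here; $item is not
}

print count_matching('a', qw(a b a)), "\n";
```

    Declaring $hits inside the loop body instead would give a fresh
    variable on every pass, and strict would reject the return line
    because no $hits would be in scope there.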

    > If I take out the use strict & warnings and implement your suggestions
    > and MSG's the program runs and works perfectly in that it summarizes
    > the first few ips properly but mysteriously stops after processing only
    > the first 7 ip #'s. There are more ip's to be tallied but for some
    > reason they are not processed. It appears that eof was detected
    > improperly.


    It is maybe your use of:

    @ifile=`cat sorted`;

    instead of:

    open INFILE, "sorted" or die "File open failed: $!";
    @ifile = <INFILE>;
    close INFILE;

    or something inside the file itself... or a combination of the two.

    Ideally you should open the file and then step through the file an entry
    at a time with a

    while (<INFILE>) {
    ...
    }

    loop.
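
    A slightly fuller sketch of that loop, assuming a lexical filehandle
    and the three-argument form of open (both are choices made for this
    example, not something the original script used). The helper name
    third_fields is invented; it returns the third whitespace-separated
    field of each line, which is where the address and port sit in the
    sample data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the third whitespace-separated field of
# every line that has one, reading the file one line at a time.
sub third_fields {
    my ($path) = @_;
    open my $in, '<', $path or die "cannot open '$path': $!\n";
    my @fields;
    while (my $line = <$in>) {
        chomp $line;
        my @parts = split ' ', $line;     # split on runs of whitespace
        push @fields, $parts[2] if defined $parts[2];
    }
    close $in;
    return @fields;
}
```

    Calling third_fields('sorted') would then return one entry per line
    without holding the whole file in memory at once.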

    By the way, please quote some context when replying as otherwise it is
    difficult to know to which post you are replying.

    For what it is worth I enclose a stripped down version of your
    program based on the above but otherwise following your logic...
    see below.

    Axel

    #!/usr/bin/perl

    use strict;
    use warnings;

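    my ($lastip, $ip);
    my $hits = 0;    # start at zero so the report never sees an undefined count

    open(REPORT,">apost.report")|| die "cannot create report $!\n";
    format REPORT=
    @<<<<<<<<<<<<<<<<<<<<<@<<<<<<<<<<<<<<<<<<<<<<<<<<<<<@<<<<<<<<<<<<<<<<<<<<<<<<
    $lastip,$hits,$lastip
    .

    # As in the original, this assumes the first record belongs to this address.
    $lastip='12.101.124.3';

    open INFILE, "sorted" or die "File open failed: $!";
    while (<INFILE>) {
        my ($time, $m, $ip1, $t) = split(/ /);
        my @ips = split(/\./, $ip1);
        $ip = join '.', @ips[ 0 .. 3 ];
        if ($ip eq $lastip) {
            $hits++;            # another record for the same address
        }
        else {
            write REPORT;       # report the address just finished
            $lastip = $ip;
            $hits = 1;          # the record just read is the first hit
        }
    }
    write REPORT;               # report the last address
    close REPORT;
    close INFILE;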

    __END__
    , May 28, 2006
    #6
  7. jlm33990 <> wrote:


    > For each unique ip number (with port not included) I want to print a
    > summary showing ip# and how many records for that number.



    Use a hash to uniqify and count things.


    -------------------------------
    #!/usr/bin/perl
    use warnings;
    use strict;

    my %cnt;
    while ( <DATA> ) {
        next unless /IP ([\d.]+)\./;
        $cnt{$1}++;
    }

    foreach ( sort keys %cnt ) {
        printf "%-20s %6d\n", $_, $cnt{$_};
    }

    __DATA__
    11:03:36.315447 IP 12.101.124.3.14459 > mandy2.me.com.smtp: S 2740947399:2740947399(0) win 65535 <mss 1380,nop,nop,sackOK>
    11:03:39.203578 IP 12.101.124.3.14459 > mandy2.me.com.smtp: S 2740947399:2740947399(0) win 65535 <mss 1380,nop,nop,sackOK>
    11:03:45.216961 IP 12.101.124.3.14459 > mandy2.me.com.smtp: S 2740947399:2740947399(0) win 65535 <mss 1380,nop,nop,sackOK>
    11:18:56.302252 IP 12.101.124.3.16527 > mandy2.me.com.smtp: S 222122110:222122110(0) win 65535 <mss 1380,nop,nop,sackOK>
    11:18:59.344184 IP 12.101.124.3.16527 > mandy2.me.com.smtp: S 222122110:222122110(0) win 65535 <mss 1380,nop,nop,sackOK>
    11:29:42.569311 IP 12.102.102.180.36150 > mandy2.me.com.smtp: S 416085218:416085218(0) win 65535 <mss 1380,nop,nop,sackOK>
    11:29:48.527397 IP 12.102.102.180.36150 > mandy2.me.com.smtp: S 416085218:416085218(0) win 65535 <mss 1380,nop,nop,sackOK>
    10:52:36.447595 IP 12.103.253.170.10434 > mandy2.me.com.smtp: . ack 1 win 64512
    10:53:25.046979 IP 12.103.253.170.10434 > mandy2.me.com.smtp: . ack 258 win 64256
    -------------------------------
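
    The pattern in the loop above can be looked at in isolation. In the
    sketch below, source_ip is a made-up helper that applies the same
    pattern to a string passed in explicitly with =~ (in the loop it is
    applied to $_, since no =~ is given). [\d.] is a character class
    meaning "a digit or a literal dot", so [\d.]+ first grabs the whole
    address plus port; the trailing \. then forces the match to back up
    to the last dot, which leaves the port outside the captured part.
    Lines that contain no such address do not match at all.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the source address without the port,
# or undef when the line does not contain "IP <address>.<port>".
sub source_ip {
    my ($line) = @_;
    # [\d.]+  one or more characters, each a digit or a literal dot (greedy)
    # \.      then one literal dot; backtracking makes this the last dot,
    #         so the trailing port digits stay outside the capture
    return $line =~ /IP ([\d.]+)\./ ? $1 : undef;
}

my $sample = '11:03:36.315447 IP 12.101.124.3.14459 > mandy2.me.com.smtp: S';
print source_ip($sample), "\n";
```

    Because the captured address is used directly as the hash key, no
    previous record has to be remembered; the hash itself holds the
    running count for every address seen so far.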


    > #!/usr/bin/perl



    You should always enable warnings and strict when developing Perl code.

    use warnings;
    use strict;


    > format REPORT=



    I didn't think anybody used formats anymore...


    > @ifile=`cat sorted`;



    That won't work if you move to a system that does not have
    a "cat" command.

    You should use native Perl to increase portability.

    You can read a file using the diamond operator:

    {
        local @ARGV = 'sorted';
        @ifile = <>;
    }

    or you can make your own filehandle with Perl's open() function.


    > ($time,$m,$ip1,$t) = split(/ /,$_);
    > ($ip2,$ip3,$ip4,$ip5,$ip6) = split(/\./,$ip1);



    Why are you grabbing all kinds of stuff that you don't intend
    to use rather than just the stuff that you do intend to use?


    > $ip="$ip2\.$ip3\.$ip4\.$ip5";



    Dots are not special in strings, so there is no need to backslash them.


    > if ($ip == $lastip) {



    That is testing for numerical equality.

    You want to test for string equality instead, since your data are
    strings, not numbers:

    if ($ip eq $lastip) {


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, May 28, 2006
    #7
  8. jlm33990 wrote:
    > I'm trying to read a file which is sorted by field 3 which is an ip
    > number with the port number added to the end. Each new record starts
    > with the timestamp.
    > Here's part of the file so you get the right idea...
    > 11:03:36.315447 IP 12.101.124.3.14459 > mandy2.me.com.smtp: S 2740947399:2740947399(0) win 65535 <mss 1380,nop,nop,sackOK>
    >
    > [snip]
    >
    > 10:52:36.447595 IP 12.103.253.170.10434 > mandy2.me.com.smtp: . ack 1 win 64512
    > 10:53:25.046979 IP 12.103.253.170.10434 > mandy2.me.com.smtp: . ack 258 win 64256
    >
    > For each unique ip number (with port not included) I want to print a
    > summary showing ip# and how many records for that number.
    > here's the code that I've been struggling with.......
    >
    > #!/usr/bin/perl
    >
    > [snip]
    >
    > close(REPORT);
    >
    > Needless to say - it doesn't work (but I'm close) - can anyone point
    > out the flaw please?


    This should do what you want:

    #!/usr/bin/perl
    use warnings;
    use strict;


    open REPORT, '>', 'apost.report' or die "cannot open 'apost.report' $!\n";
    open SORTED, '<', 'sorted' or die "cannot open 'sorted' $!\n";

    print REPORT <<'HEADER';
    apost.report

    IP# #packets
    ----------------------------------------------------------------------------
    HEADER

    my $format = "%-22s%-30d\n";
    my %data;

    while ( <SORTED> ) {
        my ( $ip ) = /^[\d:.]+ +IP +(\d+\.\d+\.\d+\.\d+)\.\d+ +/ or next;

        if ( %data and not exists $data{ $ip } ) {
            printf REPORT $format, %data;
            %data = ();
        }

        $data{ $ip }++;
    }
    printf REPORT $format, %data;


    print REPORT <<'TRAILER';
    ------------------------------------------------------------------------
    TRAILER

    __END__


    John
    --
    use Perl;
    program
    fulfillment
    John W. Krahn, May 29, 2006
    #8
  9. jlm33990

    jlm33990 Guest

    Tad-
    Thanks for the help first of all.
    I did run your program - needless to say it works perfectly.
    A couple of questions.
    First, I'm assuming when you say hash that is the same as an associative
    array - correct?
    Second, I need some help understanding your logic - specifically the
    next unless statement. I'm having a hard time conceptualizing what you
    are comparing the regular expression to. I don't see you saving the
    info from prior records. I also don't understand the actual regular
    expression. If I understand \d, . and + properly it seems you are
    searching for the literal IP followed by a space followed by a digit
    and any character except newline (\d.) any number of times (+) followed
    by a dot. I don't see how this is specific enough to get the job done.
    Again, I also don't see what you are comparing it to. Obviously I'm
    missing something.
    If you could enlighten this humble programmer I would be most
    appreciative. Thanks
    jlm33990, May 30, 2006
    #9
  10. jlm33990 <> wrote:
    > Tad-
    > Thanks for the help first of all.



    Errr, you're welcome.

    But what help are you talking about?

    Please quote some context in followups like everybody else does.


    > First I'm assuming when you say hash that is the same as an associative
    > array - correct?



    Yes, except that Perl programmers have not called them that for
    about 10 years now.

    Where did you learn the term from?


    > Second I need some help understanding your logic - specifically the
    > next unless statement. I'm having a hard time conceptualizing what you
    > are comparing the regular expression to.



    Errr, I don't remember what the heck the "next unless" code was...


    So, which is it that you are having trouble with?

    Is it the next?

    The description of next is in:

    perldoc -f next

    Is it the unless?

    The description of unless is in:

    perldoc perlsyn

    Is it what string is being compared to the pattern?

    The description of the m/PATTERN/ operator is in:

    perldoc perlop

    It says rather clearly what string is searched when you
    don't use the binding operator (=~).


    > I don't see you saving the
    > info from prior records.



    I count them (in the hash) as I encounter them.


    > I also don't understand the actual regular
    > expression. If I understand \d, . and + properly it seems you are
    > searching for the literal IP followed by a space followed by a digit
    > and any character except newline(\d.) any number of times(+) followed
    > by a dot.



    I don't remember what I was doing.

    I post lots of code for folks, I don't remember what I posted for you.

    I haven't had news access lately, so it has been several days since
    I wrote whatever it was I wrote.


    > Again I also don't see what you are comparing it to. Obviously I'm
    > missing something.



    $_


    > If you could enlighten this humble programmer I would be most
    > appreciative. Thanks



    You have not made it easy enough for me to respond helpfully.

    I'm not going to take the time to go search my archives to find
    out what you are talking about.

    If you want to discuss code, then quote the code that you are
    discussing, and I would be happy to help.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Jun 3, 2006
    #10