Help for a new Perl User - Looking for suggestion

Discussion in 'Perl' started by scadav, Jul 2, 2004.

  1. scadav

    scadav Guest

    I am new to Perl and I am trying to figure out how to solve this
    problem. If anyone can give me some suggestions, I would greatly
    appreciate it.

    I am trying to read a log file and generate some statistics from it.
    For simplicity, I have edited some of my logs and code.

    Below is an example of a log file which has 3 columns (separated by
    commas). The first column contains a time stamp, the second column
    contains a numeric identifier and the last column contains a NYSE
    Symbol:


    10:00,123,KO
    10:00,124,KO
    10:00,123,KO
    10:00,123,KO
    10:00,125,T
    10:00,125,T
    10:20,123,KO
    10:20,123,KO
    10:20,126,YY
    10:20,123,KO
    10:20,129,PP
    10:40,145,YY
    10:40,147,MM
    11:00,123,KO
    11:00,124,KO
    11:00,123,KO
    11:00,123,KO
    11:00,125,T
    11:00,125,T
    11:20,123,KO

    I am trying to determine at each time interval, how many times the
    numeric identifier appears. For example, I would like my output to
    look something like this:

    TIME   NUMERIC IDENTIFIER  OCCURRENCES  SYMBOL
    10:00  123                 3            KO
    10:00  124                 1            KO
    10:00  125                 2            T
    10:20  123                 3            KO
    10:20  126                 1            YY
    10:20  129                 1            PP
    10:40  145                 1            YY
    10:40  147                 1            MM
    11:00  123                 3            KO
    11:00  124                 1            KO
    11:00  125                 2            T
    11:20  123                 1            KO

    Please keep in mind that my log file contains roughly 70,000 rows of
    data.


    I have been working on this for some time and I am ABLE to determine
    the total number of messages in each time period (see below), but I am
    UNABLE to break it down further by numeric identifier. Can anyone
    recommend how I would do this? Below is how far I have gotten with the
    code so far:



    SAMPLE OF CODE:

    $samplelog = "test2.log";
    open (IN, "$samplelog");

    while ($rcd = <IN>){
        @fields = split(/,/,$rcd);
        $time{$fields[0]}++;
    }

    foreach $key (sort keys(%time)) {
        print "$key $time{$key} \n";
    }



    Thank you
     
    scadav, Jul 2, 2004
    #1

  2. Jim Gibson

    Jim Gibson Guest

    Use the entire record as your key. Use chomp first to remove the
    newline at the end. Then split apart the record, which is also the key
    to the hash, to do the printing:

    Jim 48% cat scadav.pl
    #!/usr/local/bin/perl

    use strict;
    use warnings;

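    my %time;    # number of times each distinct "time,id,symbol" record occurs

    while (my $rcd = <DATA>) {
        chomp($rcd);       # drop the newline so it is not part of the key
        $time{$rcd}++;
    }

    # Split each key back into its fields only when printing.
    foreach my $entry (sort keys(%time)) {
        my ($tim, $num, $id) = split(/,/, $entry);
        printf " %5s %3d %3d %s\n", $tim, $num, $time{$entry}, $id;
    }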
    __END__
    10:00,123,KO
    10:00,124,KO
    10:00,123,KO
    10:00,123,KO
    10:00,125,T
    10:00,125,T
    10:20,123,KO
    10:20,123,KO
    10:20,126,YY
    10:20,123,KO
    10:20,129,PP
    10:40,145,YY
    10:40,147,MM
    11:00,123,KO
    11:00,124,KO
    11:00,123,KO
    11:00,123,KO
    11:00,125,T
    11:00,125,T
    11:20,123,KO

    Jim 49% ./scadav.pl
     10:00 123   3 KO
     10:00 124   1 KO
     10:00 125   2 T
     10:20 123   3 KO
     10:20 126   1 YY
     10:20 129   1 PP
     10:40 145   1 YY
     10:40 147   1 MM
     11:00 123   3 KO
     11:00 124   1 KO
     11:00 125   2 T
     11:20 123   1 KO
    Jim 50%

    If you wish to break down your data in different ways or sort it in a
    different order, then you need to consider more complicated solutions
    such as a hash-of-hashes or a hash-of-arrays to store your multi-level
    data.
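
    As a rough sketch of the hash-of-hashes idea (not part of the program
    above): the outer hash is keyed by time stamp and the inner one by
    numeric identifier, so each level can be sorted or regrouped on its
    own. The names count_records, %count and %symbol are invented for
    illustration, and the sketch assumes each identifier always carries
    the same symbol.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build a hash-of-hashes from "time,id,symbol" lines.
# $count{$time}{$id} holds the number of occurrences;
# $symbol{$id} remembers the symbol (assumes one symbol per identifier).
sub count_records {
    my @lines = @_;
    my (%count, %symbol);
    for my $line (@lines) {
        chomp(my $rcd = $line);
        my ($time, $id, $sym) = split /,/, $rcd;
        $count{$time}{$id}++;
        $symbol{$id} = $sym;
    }
    return (\%count, \%symbol);
}

my @sample = ("10:00,123,KO\n", "10:00,124,KO\n",
              "10:00,123,KO\n", "10:20,126,YY\n");
my ($count, $symbol) = count_records(@sample);

# Sort by time (a string sort is enough for fixed-width HH:MM),
# then numerically by identifier within each time.
for my $time (sort keys %$count) {
    for my $id (sort { $a <=> $b } keys %{ $count->{$time} }) {
        printf " %5s %3d %3d %s\n",
            $time, $id, $count->{$time}{$id}, $symbol->{$id};
    }
}
```

    With the counts held per level, a different breakdown (for example,
    totals per identifier across all times) only needs a different pair of
    loops over the same structure.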

    Some more pointers for those new to Perl:

    1. Always put 'use strict' and 'use warnings' at the beginning of your
    program, then declare all of your variables with 'my' or 'our' or use a
    package name for globals.

    2. Always check the results of an open call (and all other system
    calls, as well):

    open (IN, $samplelog) or die("Can't open $samplelog: $!");

    3. There is no need to put double-quotes around $samplelog in your
    open call.

    4. Post further miscellaneous Perl questions to comp.lang.perl.misc, as
    this newsgroup is defunct, but be sure to check the guidelines for
    that newsgroup before doing so:

    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
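
    Pointers 1 to 3 could be combined along the lines of the sketch below.
    It uses the three-argument form of open with a lexical filehandle,
    which is a common idiom but not the form shown in point 2. The sub
    name count_log is invented for illustration, and the demo writes a
    small temporary file (via the core File::Temp module) only so that it
    runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: count identical records in a log file.
# Dies with the system error message ($!) if the file cannot be opened.
sub count_log {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Can't open $path: $!";
    my %seen;
    while (my $rcd = <$in>) {
        chomp $rcd;
        $seen{$rcd}++;
    }
    close($in) or die "Can't close $path: $!";
    return \%seen;
}

# Demo: write a small throwaway log so the sketch is self-contained.
my ($fh, $samplelog) = tempfile(UNLINK => 1);
print {$fh} "10:00,123,KO\n10:00,124,KO\n10:00,123,KO\n";
close($fh) or die "Can't close $samplelog: $!";

my $seen = count_log($samplelog);
for my $entry (sort keys %$seen) {
    my ($tim, $num, $id) = split /,/, $entry;
    printf " %5s %3d %3d %s\n", $tim, $num, $seen->{$entry}, $id;
}
```

    For a real run, pass the actual log path (test2.log in the original
    post) to count_log in place of the temporary file.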
     
    Jim Gibson, Jul 2, 2004
    #2

  3. scadav

    scadav Guest

    Thank you for your assistance and suggestions; they are much appreciated.
     
    scadav, Jul 3, 2004
    #3
