counting question.

Truthless

Hello All,

I am new to Perl; however, I have a decent knowledge of scripting in
general. I am trying (in the name of science) to convert a bash script I
had written into a cgi script. This script searches mail log files for
sending relays.

Using a series of loops and regexes I am able to extract the data I
want from the log. The only thing that is holding me up is that I want
to count the number of occurrences of individual matches.

The following is part of the code I am working with:

sub findrelay{
if ($_[0]=~ /relay=(.*)\s/){
print "$1\n";
}
}

This will print out all the text after relay= in my sendmail log. I
want to be able to count the number of occurrences of each individual
relay. I am not certain which direction to go in. Could someone please
offer suggestions on how this is done? I found it easy with the grep,
uniq and sort commands in bash. I am basically looking for the equivalent
of uniq -c | sort -r. Any help would be greatly appreciated.


Thanks in advance,

T.
 
Anno Siegel

Truthless said:
Hello All,

I am new to Perl; however, I have a decent knowledge of scripting in
general. I am trying (in the name of science) to convert a bash script I
had written into a cgi script.
^^^^^^^^^^

CGI is an interface, not a scripting language. Do you mean "Perl script"?

This script searches mail log files for
sending relays.

Using a series of loops and regexes I am able to extract the data I
want from the log. The only thing that is holding me up is that I want
to count the number of occurrences of individual matches.

The following is part of the code I am working with:

sub findrelay{
if ($_[0]=~ /relay=(.*)\s/){
print "$1\n";
}
}

Ugh. No indentation.

This will print out all the text after relay= in my sendmail log. I
want to be able to count the number of occurrences of each individual
relay. I am not certain which direction to go in. Could someone please
offer suggestions on how this is done? I found it easy with the grep,
uniq and sort commands in bash. I am basically looking for the equivalent
of uniq -c | sort -r. Any help would be greatly appreciated.

Use a hash for counting (untested):

sub findrelay {
    ( local $_, my $count) = @_;
    if ( /relay=(.*)\s/ ) {
        $count->{ $1}++;
        print "$1\n";
    }
}

...to be used like this:

my %count;
findrelay( $_, \ %count) while ( <LOG> );

After that, $count{ xyz} is the number of times relay xyz was seen.

Anno
 
news

Truthless said:
I am new to Perl; however, I have a decent knowledge of scripting in
general. I am trying (in the name of science) to convert a bash script I
had written into a cgi script.

Anno Siegel said:
CGI is an interface, not a scripting language. Do you mean "Perl script"?

Actually in this case I think that "Truthless"'s description is a fair
one. A perl script that is implemented for use under CGI could reasonably
be called a cgi script. In the context of a perl newsgroup we would all
expect such a cgi script to be written in perl.

Regards,
Chris
 
John J. Trammell

news said:
Actually in this case I think that "Truthless"'s description is a fair
one. A perl script that is implemented for use under CGI could
reasonably be called a cgi script. In the context of a perl newsgroup
we would all expect such a cgi script to be written in perl.

Sorry to nitpick, but "converting a bash script to a CGI
script" doesn't make any sense. Many bash scripts are
already CGI scripts. A more sensible statement would be:

"I have a CGI script written in bash, and I'd like
to convert it to Perl."

or

"I have a bash script that I'd like to make available
on the web, so I'm going to rewrite it in Perl with a
CGI interface."

In other words, the bashness (bashitude?) is orthogonal
to the CGIness of a script. You're probably right, that
Truthless means to use Perl. But statements like that
indicate Muddled Thinking to me; best to clear them up.
 
Truthless

Anno said:
^^^^^^^^^^
CGI is an interface, not a scripting language. Do you mean "Perl script"?

The file name is spamhunt.pl and it runs as a CGI (outputs HTML). Sorry, I
didn't think that people would assume that I was using something other
than Perl, due to the whole comp.lang.perl.misc thing.


The following is part of the code I am working with:

sub findrelay{
if ($_[0]=~ /relay=(.*)\s/){
print "$1\n";
}
}


Ugh. No indentation.

It is indented in the actual file; this was a snippet that I had typed. I
forgot to indent it. Oops.

Use a hash for counting (untested):

sub findrelay {
    ( local $_, my $count) = @_;
    if ( /relay=(.*)\s/ ) {
        $count->{ $1}++;
        print "$1\n";
    }
}

...to be used like this:

my %count;
findrelay( $_, \ %count) while ( <LOG> );

After that, $count{ xyz} is the number of times relay xyz was seen.

Anno

This code looks like it will do the count trick but one question
remains. How do I keep track of what xyz's there are. Would I need
another hash? As far as my limited knowledge of perl goes I don't see
how I can list all unique relays AND count them.

How can I ensure I have a unique list of the relays and count how many
times they appeared in the log?
The "bashful" code is something like | sort | uniq -c | sort -r
Perhaps I missed that part.

Thanks for your help.

T.
 
gnari

John J. Trammell said:
Anno Siegel wrote:
[snipped discussion]

In other words, the bashness (bashitude?) is orthogonal
to the CGIness of a script. You're probably right, that
Truthless means to use Perl. But statements like that
indicate Muddled Thinking to me; best to clear them up.

on the other hand, maybe the OP just wants to convert his bash script
to a cgi script, and thinks the only (or easiest) way to do that is to
convert it to Perl. Maybe he only has cgi examples in Perl.

if that is the case, and he is not especially interested in the Perl
conversion, he *could* just make a minimal perl cgi script that just
executes his old bash script.

let's face it, not everyone is excited about Perl, for some reason.

gnari
 
Truthless

gnari said:

[snipped discussion]
if that is the case, and he is not especially interested in the Perl
conversion, he *could* just make a minimal perl cgi script that just
executes his old bash script.

let's face it, not everyone is excited about Perl, for some reason.

gnari

I am (excitedly) trying to learn Perl. The "(in the name of science)" was
supposed to let everyone know that while I could use a perl/cgi wrapper
and accomplish the task, it would not help me learn how to sift through
logs in Perl, which is my main goal. The bash script does the job and is
only about 6 lines of code. The Perl version (incomplete) is up to
around 40.
Maybe I am going about it the wrong way. I did assume that it would take
some work to rewrite a bash script that uses grep, awk, sort, uniq and
cut as a Perl script. I thought that maybe there were some built-in
functions or basic methods that I had overlooked. It just seems that
the script is getting clunky and complicated really quickly.

Thanks,

T.
 
gnari

Truthless said:
This code looks like it will do the count trick but one question
remains. How do I keep track of what xyz's there are. Would I need
another hash? As far as my limited knowledge of perl goes I don't see
how I can list all unique relays AND count them.

How can I ensure I have a unique list of the relays and count how many
times they appeared in the log?

print "relay=$_ count=$count{$_}\n" foreach keys %count;

gnari
 
John W. Krahn

Truthless said:
This code looks like it will do the count trick but one question
remains. How do I keep track of what xyz's there are. Would I need
another hash? As far as my limited knowledge of perl goes I don't see
how I can list all unique relays AND count them.

How can I ensure I have a unique list of the relays and count how many
times they appeared in the log?

By definition, hash keys ARE unique so just storing xyz as a hash key
ensures that it is unique. Incrementing the value of $count{xyz} keeps
the count for that unique key.
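
A minimal sketch of that point, using a few made-up log lines (the host names and addresses are placeholders, not taken from any real log). One hash gives both the list of distinct relays (its keys) and the tallies (its values), so no second hash is needed:

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for a sendmail log.
my @lines = (
    'to=<a@example.com>, relay=mx1.example.com [192.0.2.1], stat=Sent',
    'to=<b@example.com>, relay=mx2.example.com [192.0.2.2], stat=Sent',
    'to=<c@example.com>, relay=mx1.example.com [192.0.2.1], stat=Sent',
);

my %count;
for my $line (@lines) {
    # \S+ stops at the first whitespace, so only the relay name is captured.
    $count{$1}++ if $line =~ /relay=(\S+)/;
}

# keys %count is the list of distinct relays; the values are the tallies.
for my $relay (sort keys %count) {
    print "$relay $count{$relay}\n";
}
```

Seeing mx1.example.com a second time increments the existing entry rather than creating a new one, which is why the keys stay unique.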


John
 
gnari

Truthless said:
I am (excitedly) trying to learn Perl. The "(in the name of science)" was
supposed to let everyone know that while I could use a perl/cgi wrapper
and accomplish the task, it would not help me learn how to sift through
logs in Perl, which is my main goal.

in that case, by all means continue.

The bash script does the job and is
only about 6 lines of code. The Perl version (incomplete) is up to
around 40.
Maybe I am going about it the wrong way. I did assume that it would take
some work to rewrite a bash script that uses grep, awk, sort, uniq and
cut as a Perl script. I thought that maybe there were some built-in
functions or basic methods that I had overlooked. It just seems that
the script is getting clunky and complicated really quickly.

when you have something working, you can post it here, and I am sure
you will get lots of suggestions and criticisms

what does your script do apart from counting relays?

this would print the list of relays with their count sorted by count:

use strict;
use warnings;
my %count;
while (<>) {
    $count{$1}++ if /relay=(\S*)/;
}
print "relay=$_ count=$count{$_}\n"
    foreach sort {$count{$a} <=> $count{$b}} keys %count;
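
The sort above lists the least frequent relay first. To get closer to the ordering of the shell pipeline `sort | uniq -c | sort -r`, the comparison can be reversed. The following is only a sketch: the helper name `relays_by_count` and the sample counts are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: relay names ordered by descending count,
# with ties broken alphabetically so the output is stable.
sub relays_by_count {
    my ($count) = @_;
    return sort { $count->{$b} <=> $count->{$a} or $a cmp $b } keys %$count;
}

# Made-up tallies, as they might look after reading a log.
my %count = (
    'mx1.example.com' => 2,
    'mx2.example.com' => 5,
    'mx3.example.com' => 2,
);

# Print in a "count name" layout similar to uniq -c.
printf "%7d %s\n", $count{$_}, $_ for relays_by_count(\%count);
```

Swapping `$a` and `$b` in the numeric comparison is what turns the ascending sort into a descending one.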

gnari
 
Jürgen Exner

Truthless said:
I am new to Perl; however, I have a decent knowledge of scripting in
general. I am trying (in the name of science) to convert a bash
script I had written into a cgi script.

Well, then I suggest that you ask in a newsgroup that deals with CGI.

In short (although off topic): make sure that your bash script
- returns the proper header for CGI, or rather HTTP
- and that the actual data is formatted in HTML (although that is not a
strict requirement; the script could return e.g. plain text instead).

jue
 
Truthless

Thanks to all of you who helped me out. I have finished the script. I
now present it for your examination. I can use all the advice I can get.
I had to open the file for reading twice; for some reason I couldn't
open it once and read it twice. The script takes a domain name and
searches for all messages destined for that domain, then counts the
relays. What is the Perl CGI equivalent of the PHP variable
$_SERVER["PHP_SELF"]? I would like to use that in the form rather than
action="spamhunter.pl"

#!/usr/bin/perl
# spamhunter.pl
#Released under the GNU General Public License
#
$maillog = '/var/log/maillog'; # change to match your log location.



$domain = $ENV{'QUERY_STRING'}; # get domain name from form.
if ( $domain =~ /domain=(.*)/){
$domain = $1;
}

print "Content-type: text/html\n\n";
print "<html><body bgcolor=\"black\" text=\"white\">\n";

print <<EndOfHTML;
<h2>Spamhunter.pl</h2>Released under the GNU General Public License<br><br>
Search sendmail logs for relays that have sent mail to a domain
<br><br><br>
<form name="input" action="spamhunter.pl" method="get">
Enter the domain name:
<input type="text" name="domain" value="somedomain.com" >
<br>
<input type="submit" value="Search log">
</form>
EndOfHTML


if ($domain){
print "Results for $domain displayed below<br><br>";
print '========================================<br><br>';

open(INFILE, "$maillog");

while (<INFILE>){

if (/$domain/){
if (($_ =~ /:)\s\w+:)/) ){
push(@id, $1);
}
}
}
close INFILE;

open(INFILE, "$maillog");

foreach (<INFILE>){
if ($_ =~ /from/){
foreach $id (@id){
if ($_ =~ /$id/){
$count{$1}++ if /relay=(\S*)/;
print "$_ $count{$_} \n"
foreach sort {$count{$a}<=> $count{$b}} keys %count;
}
}
}
}
close INFILE;

}

print "</html></body>\n";
 
Tad McClellan

Truthless said:
open(INFILE, "$maillog");


You should always, yes *always*, check the return value from open().

perldoc -q vars

What's wrong with always quoting "$vars"?


open(INFILE, $maillog) or die "could not open '$maillog' $!";
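
The same check can also be written with a lexical filehandle and the three-argument form of open. This is a sketch only: the temporary file stands in for the mail log so that the example runs on its own, and the variable names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file so the example is self-contained; a real script
# would use the path to its mail log instead.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "relay=mx1.example.com\n";
close $tmp_fh or die "could not close '$path': $!";

# Three-argument open with a lexical filehandle; $! holds the reason
# if the open fails.
open(my $in, '<', $path) or die "could not open '$path': $!";
my @lines = <$in>;
close $in;

print scalar(@lines), " line(s) read\n";
```

Keeping the mode ('<') separate from the file name means a path containing characters such as '>' or '|' cannot change how the file is opened.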
 
gnari

Truthless said:
Thanks to all of you who helped me out. I have finished the script. I
now present it for your examination. I can use all the advice I can get.
#!/usr/bin/perl
use strict;
use warnings; # these are useful to spot errors/problems.
# spamhunter.pl
#Released under the GNU General Public License
#
$maillog = '/var/log/maillog'; # change to match your log location.

because of the use strict above, you need to declare all variables,
so this would become:

my $maillog = '/var/log/maillog'; # change to match your log location

$domain = $ENV{'QUERY_STRING'}; # get domain name from form.

my $domain .... as above
you might want to look at the CGI module. it can simplify cgi stuff

if ( $domain =~ /domain=(.*)/){
$domain = $1;
}

if there are more params after 'domain', you will get a strange result.
$domain = $1 if $domain =~ /domain=([^&]*)/;
is better but still does not deal with urlencoded params.
so, again look at CGI module (perldoc CGI)
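
To illustrate what "urlencoded params" involves, here is a rough sketch that splits and decodes a query string by hand. The helper name `parse_query` and the sample query string are made up for this example; it is not how the CGI module does it, and the module remains the more robust choice for a real script.

```perl
use strict;
use warnings;

# Hypothetical helper: split a query string into a hash of decoded values.
sub parse_query {
    my ($query) = @_;
    my %param;
    for my $pair (split /[&;]/, $query // '') {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                             # '+' encodes a space
            s/%([0-9A-Fa-f]{2})/chr hex $1/ge;   # %XX encodes one byte
        }
        $param{$name} = $value;
    }
    return %param;
}

# In a CGI script the argument would be $ENV{'QUERY_STRING'}.
my %param = parse_query('domain=example.com&sort=count%20desc');
print "domain=$param{domain}\n";
```

Because each pair is split on '&' first, a second parameter no longer ends up inside the domain value.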
print "Content-type: text/html\n\n";
print "<html><body bgcolor=\"black\" text=\"white\">\n";

there are many ways to get rid of those \" . i see that you use HereDocs
later, but do you know about qq() ?

print qq(<html><body bgcolor="black" text="white">\n);

print <<EndOfHTML;
<h2>Spamhunter.pl</h2>Released under the GNU General Public License<br><br>
Search sendmail logs for relays that have sent mail to a domain
<br><br><br>
<form name="input" action="spamhunter.pl" method="get">
Enter the domain name:
<input type="text" name="domain" value="somedomain.com" >
<br>
<input type="submit" value="Search log">
</form>
EndOfHTML


if ($domain){
print "Results for $domain displayed below<br><br>";
print '========================================<br><br>';

some indenting would have been nice

open(INFILE, "$maillog");

don't forget to check if the open fails.

OK. explain what you are trying to do with the rest. I don't get it.
maybe show a few samples of your input lines, and tell us what you are trying
to do. you probably do not need to read the file twice, and I cannot imagine
you need to have the sort inside a double foreach loop.


gnari
 
Ken

I have a complex data structure (something like several nested arrays)
which I'm trying to save a copy of. However, everything I've tried will
only save a reference to it. I can't find anything on the distinction
(if there is one) between an array-of-arrays and an
array-of-references-to-arrays.

This test script sums up my problem. I'd like to keep copies of @state
as it changes in each iteration, but instead, I always end up with
references to the current value of @state.

--------------------------------

my @a = ( [0, 0], [0, 0] );
my @b = ( [0, 0], [0, 0] );

my (@state, @state_history);
$state[0] = [@a];
$state[1] = [@b];

print "LIVE:\n";
for (0 .. 2)
{
    push(@{$state[0][0]}, $_); # change state
    push(@state_history, [@state]); # want to save a copy of current state
    print " step $_: a0 = " . join(' ', @{$state[0][0]}) . "\n";
}

print "REPLAY:\n";
for my $i (0 .. $#state_history)
{
    my @curr_state = @{$state_history[$i]};
    print " step $i: a0 = " . join(' ', @{$curr_state[0][0]}) . "\n";
}

----------------------------------

The output is:
LIVE:
step 0: a0 = 0 0 0
step 1: a0 = 0 0 0 1
step 2: a0 = 0 0 0 1 2
REPLAY:
step 0: a0 = 0 0 0 1 2
step 1: a0 = 0 0 0 1 2
step 2: a0 = 0 0 0 1 2

You see the "replay" contains only the final state, rather than the
history of the state updates. How can this be changed so that REPLAY
looks just like LIVE?
I thought using [@state] as opposed to \@state would work, but
apparently not. How are these different?

Thanks in advance.
 
John J. Trammell

I have a complex data structure (something like several nested arrays)
which I'm trying to save a copy of.

This is addressed in Perl FAQ #4: "How do I print out or copy
a recursive data structure?".
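
For context: [@state] builds a new top-level array, but its elements are still the same references to the inner arrays, so every saved "copy" shares the inner data. One common way to copy all levels is dclone from the core Storable module. The sketch below adapts the test script under that assumption; the data layout mirrors the question, but the loop itself is illustrative.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Same shape as in the question: two elements, each an array of arrays.
my @state = ( [ [0, 0], [0, 0] ], [ [0, 0], [0, 0] ] );
my @state_history;

for my $step (0 .. 2) {
    push @{ $state[0][0] }, $step;           # change state
    push @state_history, dclone(\@state);    # deep copy of every nested level
}

# Each saved snapshot now keeps the values it had when it was taken.
for my $i (0 .. $#state_history) {
    print " step $i: a0 = @{ $state_history[$i][0][0] }\n";
}
```

dclone takes a reference and returns a reference to an independent copy, so later changes to @state no longer show up in the history.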
 
