perl style: can I combine two steps into one?

M

Michael Slass

I have a hash where the keys are strings and the values are integers.
I wanted to produce a report of the name/value pairs, sorted by the
integers in descending order.

I came up with this:

,----
| #!/usr/bin/perl -w
| use strict;
|
| my %hack_count;
| my @report;
|
| # read in ssh attempts from /var/log/messages*
| while (<>)
| {
| if (/sshd.+rhost=(\S+)/)
| {
| ++$hack_count{$1};
| }
| }
|
| # create a list of (count, ip) cells, sorted by count
| my ($ip, $count);
| while (($ip, $count) = each %hack_count)
| {
| push @report, [$count, $ip];
| }
|
| foreach (sort { $b->[0] <=> $a->[0]; } @report) {
| printf("%-50s\t%4d\n", $_->[1], $_->[0]);
| }
`----

Is there a way to combine the "each" and "push" statements so I don't
need the intermediate variables $ip and $count? The order of the
variables in the anonymous lists is not important; I just put the
numerical value first since I figured I'd remember to sort on the
first element of the anonymous lists.

Thanks.
 
P

Paul Lalli

Michael Slass said:
I have a hash where the keys are strings and the values are integers.
I wanted to produce a report of the name/value pairs, sorted by the
integers in descending order.

I came up with this:
| # create a list of (count, ip) cells, sorted by count
| my ($ip, $count);
| while (($ip, $count) = each %hack_count)
| {
| push @report, [$count, $ip];
| }
Is there a way to combine the "each" and "push" statements so I don't
need the intermediate variables $ip and $count?

my @report = map {[$hack_count{$_}, $_]} keys %hack_count;
# OR . . .
my @report;
push @report, [$hack_count{$_}, $_] for keys %hack_count;

Of course, these are both still using an intermediate variable of $_.
That may or may not be 'good enough' for you.

Paul Lalli
 
J

John Bokma

Michael said:
I have a hash where the keys are strings and the values are integers.
I wanted to produce a report of the name/value pairs, sorted by the
integers in descending order.

I came up with this:

,----
| #!/usr/bin/perl -w

drop the -w and

use warnings;
| # read in ssh attempts from /var/log/messages*
| while (<>)
| {
| if (/sshd.+rhost=(\S+)/)
| {
| ++$hack_count{$1};

I recommend using $hack_count{$1}++;

[ sorting ]

printf "%-50s\t%4d\n", $_, $hack_count{ $_ }
for sort { $hack_count{ $b }<=>$hack_count{ $a } } keys %hack_count;
 
M

mjl69

I have a hash where the keys are strings and the values are integers.
I wanted to produce a report of the name/value pairs, sorted by the
integers in descending order.

I came up with this:

,----
| #!/usr/bin/perl -w
| use strict;
|
| my %hack_count;
| my @report;
|
| # read in ssh attempts from /var/log/messages*
| while (<>)
| {
| if (/sshd.+rhost=(\S+)/)
| {
| ++$hack_count{$1};
| }
| }
|
| # create a list of (count, ip) cells, sorted by count
| my ($ip, $count);
| while (($ip, $count) = each %hack_count)
| {
| push @report, [$count, $ip];
| }
|
| foreach (sort { $b->[0] <=> $a->[0]; } @report) {
| printf("%-50s\t%4d\n", $_->[1], $_->[0]);
| }
`----

Is there a way to combine the "each" and "push" statements so I don't
need the intermediate variables $ip and $count? The order of the
variables in the anonymous lists is not important; I just put the
numerical value first since I figured I'd remember to sort on the
first element of the anonymous lists.

Thanks.

@report = map {[$_->[1], $_->[0]]}sort {$b->[0] <=> $a->[0]} map {[$hack_count{$_}, $_]} keys %hack_count;
 
M

Michael Slass

Paul Lalli said:
Michael Slass said:
I have a hash where the keys are strings and the values are integers.
I wanted to produce a report of the name/value pairs, sorted by the
integers in descending order.

I came up with this:
| # create a list of (count, ip) cells, sorted by count
| my ($ip, $count);
| while (($ip, $count) = each %hack_count)
| {
| push @report, [$count, $ip];
| }
Is there a way to combine the "each" and "push" statements so I don't
need the intermediate variables $ip and $count?

my @report = map {[$hack_count{$_}, $_]} keys %hack_count;
# OR . . .
my @report;
push @report, [$hack_count{$_}, $_] for keys %hack_count;

Of course, these are both still using an intermediate variable of $_.
That may or may not be 'good enough' for you.

Paul Lalli

Ahh.. didn't know about map. That's perfect, thank you. I'll read
the docs about it.
 
J

John Bokma

mjl69 wrote:

[ snip ]

my @report =

map { [ $_->[ 1 ], $_->[ 0 ] ] }
sort { $b->[ 0 ] <=> $a->[ 0 ] }
map { [ $hack_count{ $_ }, $_ ] }
keys %hack_count;

Amazing how things can become more readable by careful reformating :-D
 
M

mjl69

mjl69 wrote:

[ snip ]

my @report =

map { [ $_->[ 1 ], $_->[ 0 ] ] }
sort { $b->[ 0 ] <=> $a->[ 0 ] }
map { [ $hack_count{ $_ }, $_ ] }
keys %hack_count;

Amazing how things can become more readable by careful reformating :-D

I was just looking for a place to use the Schwartzian Transform. It is kind of a stretch. He did not really need the elements of the anonymous array reversed at the end (the beginning).

mjl
 
A

Anno Siegel

mjl69 said:
mjl69 wrote:

[ snip ]

my @report =

map { [ $_->[ 1 ], $_->[ 0 ] ] }
sort { $b->[ 0 ] <=> $a->[ 0 ] }
map { [ $hack_count{ $_ }, $_ ] }
keys %hack_count;

Amazing how things can become more readable by careful reformating :-D

I was just looking for a place to use the Schwartzian Transform. It is
kind of a stretch. He did not really need the elements of the anonymous
array reversed at the end (the beginning).

It is not a good example to demonstrate the Schwartzian Transform. Its
point is to avoid the repeated calculation of a sort key by replacing
it with access to a Perl data structure (a list of pairs). When sorting
by the values of a hash, we already have a data structure that gives
fast access to the sort keys -- the hash itself. There is little
point in building another for sorting. That the OP *wants* a very
similar data structure is a different consideration.

On the other hand, the Schwartzian transform doesn't hinge on he fact
that the sort key is the first element of each pair. Building the
wanted structure from the start and indexing on 1 wouldn't make it
less of one. The final (top) step is unnecessary and missing now,
so it's only half of a Schwartzian still.

sort { $b->[ 1] <=> $a->[ 1] }
map [ $_, $hack_count{ $_}],
keys %hack_count;

Anno
 
J

John Bokma

Anno said:

Post increment is more used, and hence more natural to read. I only use pre
increment when it does matter. If I see ++$var I wonder what is happening
there. Like another example I posted last week somewhere else:

(pseudocode)

i = 0
while (true) {


i++;
last if i >= 10;
}

I consider >= 10 misleading if the previous statement is the only one that
modifies i. It can't get > 10 , so why test for it?
 
J

John Bokma

Anno said:
less of one. The final (top) step is unnecessary and missing now,
so it's only half of a Schwartzian still.

sort { $b->[ 1] <=> $a->[ 1] }
map [ $_, $hack_count{ $_}],
keys %hack_count;

An Artzian transform :-D.

BTW, why a space before the 1 and not after? I use for some time a lot of
white space, ie:

sort { $b->[ 1 ] <=> $a->[ 1 ] }
map [ $_, $hack_count{ $_ } ],
keys %hack_count;

(just curious) I might even line up a bit, e.g.

sort { $b->[ 1 ] <=> $a->[ 1 ] }
map [ $_, $hack_count{ $_ } ],
keys %hack_count;

(probably has a lot to do with using a 10pt font :) ).
 
A

Anno Siegel

John Bokma said:
Post increment is more used, and hence more natural to read. I only use pre
increment when it does matter. If I see ++$var I wonder what is happening
there.

It is more frequently used, but for no good reason. Pre-increment is
the simpler process, not only implementationally. There are two values
involved in $x++, but only one in ++$x. I'm not very consequent about it,
but I'm trying to give "++ $x" a better chance.
Like another example I posted last week somewhere else:
(pseudocode)

i = 0
while (true) {


i++;
last if i >= 10;
}

I consider >= 10 misleading if the previous statement is the only one that
modifies i. It can't get > 10 , so why test for it?

That hinges entirely on the clause "... only (statement) that modifies i",
amended by "and there never will be another one". I take ">=" comparison
of loop indices for granted as a bit of defensive programming, it doesn't
confuse me.

Anno
 
A

Anno Siegel

John Bokma said:
Anno said:
less of one. The final (top) step is unnecessary and missing now,
so it's only half of a Schwartzian still.

sort { $b->[ 1] <=> $a->[ 1] }
map [ $_, $hack_count{ $_}],
keys %hack_count;

An Artzian transform :-D.

BTW, why a space before the 1 and not after? I use for some time a lot of
white space, ie:

sort { $b->[ 1 ] <=> $a->[ 1 ] }
map [ $_, $hack_count{ $_ } ],
keys %hack_count;

(just curious) I might even line up a bit, e.g.

I lump together closing brackets of all kind, unless I want one to stand
out. You don't want to miss an opening bracket, but the closing ones only
confirm what you know. A lump of them can be parsed as "this closes
everything" (up to a point that can be indicated through indentation).

I have carried this over from Lisp (where it's standard, though with only
one kind of bracket), but it works with most languages. It is also
kind of a personal hallmark. I know it is not particularly pretty,
especially in short bits of code.

Anno
 
R

Rasto Levrinc

John said:
there. Like another example I posted last week somewhere else:

(pseudocode)

i = 0
while (true) {


i++;
last if i >= 10;
}

I consider >= 10 misleading if the previous statement is the only one that
modifies i. It can't get > 10 , so why test for it?

The reason for that is, that if there is some obscure bug in the code,
it is less likely, that the program will run in endless loop. I was told
so, maybe 10 years ago and I am doing it like this ever since, without
much thinking about this. But probably it was never needed. :)
 
J

John Bokma

Anno said:
It is more frequently used, but for no good reason. Pre-increment is
the simpler process, not only implementationally. There are two
values involved in $x++, but only one in ++$x.

A programmer shouldn't be bothered by that in normal cases.
I'm not very
consequent about it, but I'm trying to give "++ $x" a better chance.

To me that sounds like sacrifycing readability.
That hinges entirely on the clause "... only (statement) that modifies
i", amended by "and there never will be another one".
Yup.

I take ">="
comparison of loop indices for granted as a bit of defensive
programming,

So you prefer your program silently to move on?
 
J

John Bokma

Rasto said:
The reason for that is, that if there is some obscure bug in the code,
it is less likely, that the program will run in endless loop.

And more likely to move on silently. IMNSHO this is wrong.
I was
told so, maybe 10 years ago and I am doing it like this ever since,
without much thinking about this. But probably it was never needed.

Think again about it, and forgot what someone told to you 10 years ago.
IMNSHO programming to work around mistakes in your code is wrong.

Also, it confuses people. Do you:

a = 10
:
: some code that doesn't do anything with a
:
if ( a != 10 ) a = 10 just in case

I hope not.
 
J

John Bokma

Abigail said:
John Bokma ([email protected]) wrote on MMMMCLXXVI September
MCMXCIII in <URL:][ Anno Siegel wrote:
][
][ > sort { $b->[ 1] <=> $a->[ 1] }
][
][ BTW, why a space before the 1 and not after? I use for some time a
]lot of [ white space, ie:


The two lines above contain two commas, a question mark and a colon.
Each of them are followed by space, but not preceeded by any.

Why?

Because it's programmed in English, which has a style guide that recommends
not putting spaces in front of commas, question marks etc.
 
J

John Bokma

Anno said:
John Bokma said:
Anno said:
less of one. The final (top) step is unnecessary and missing now,
so it's only half of a Schwartzian still.

sort { $b->[ 1] <=> $a->[ 1] }
map [ $_, $hack_count{ $_}],
keys %hack_count;

An Artzian transform :-D.

BTW, why a space before the 1 and not after? I use for some time a
lot of white space, ie:

sort { $b->[ 1 ] <=> $a->[ 1 ] }
map [ $_, $hack_count{ $_ } ],
keys %hack_count;

(just curious) I might even line up a bit, e.g.

I lump together closing brackets of all kind, unless I want one to
stand out. You don't want to miss an opening bracket, but the closing
ones only confirm what you know. A lump of them can be parsed as
"this closes everything" (up to a point that can be indicated through
indentation).

Ok, so the first line should have be:

sort { $b->[ 1] <=> $a->[ 1]}

?
I have carried this over from Lisp (where it's standard, though with
only one kind of bracket), but it works with most languages. It is
also kind of a personal hallmark. I know it is not particularly
pretty, especially in short bits of code.

I understand the idea, but most editors do bracket matching. And in
cases when it causes problem, I break things a bit more up, for example:

sort {

$b->[ 1 ] <=> $a->[ 1 ]

} map [

$_,
$hack_count{ $_ }

] keys %hack_count;
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,776
Messages
2,569,603
Members
45,189
Latest member
CryptoTaxSoftware

Latest Threads

Top