Script that gives instance count of unique patterns in a sorted file

  • Thread starter Generic Usenet Account
Generic Usenet Account

I had a need for a script that operates on a text file, sorted by
lines, which gives the instance count for each unique pattern in the
file. Using my limited shell-scripting capabilities, I have come up
with something which works. However, I am keen to know how the real
"pros" will approach this problem. My source code follows.

Thanks,
Song

/////////////////////////////////////////////////////////////////////////
#!/bin/ksh
#
# Counts number of repetitions of each line in a sorted file
#
#set -x

USAGE_STR="$0 <file-name>"

function countRepetitions
{
    if [ "$#" -ne 0 ]
    then
        prev=""
        count=0

        # IFS= and -r keep leading blanks and backslashes intact
        while IFS= read -r current
        do
            if [ "$count" -gt 0 ] && [ "$current" != "$prev" ]
            then
                # A new run has started: report the previous one
                printf '%s : %s\n' "$prev" "$count"
                count=0
            fi
            prev=$current
            count=$((count + 1))
        done < "$1"

        # Report the final run, which no later line would trigger
        if [ "$count" -gt 0 ]
        then
            printf '%s : %s\n' "$prev" "$count"
        fi
    fi
}


if [ "$#" -eq 0 ]
then
    echo "$USAGE_STR"
    exit 1
elif [ ! -f "$1" ]
then
    echo "Error: Cannot open file $1"
    echo "$USAGE_STR"
    exit 2
fi

countRepetitions "$1"
exit 0
 
Uri Guttman

that is some of the worst perl code i have ever seen. it won't even
compile. come back when your code compiles cleanly.

uri
 
Uri Guttman

MB> I don't think the OP was posting perl; it was a shell script.

Duh!!!

uri
 
Paul Lalli

Marc said:
I don't think the OP was posting perl; it was a shell script.

Your sarcasm meter is pretty out of whack. You might want to reboot
it.

Paul Lalli
 
Paul Lalli

Denver said:
He multiply cross-posted the message.
It is not perl at all.

No kidding.

Posting a shell script to a Perl newsgroup is not made acceptable by
simply cross-posting the same shell script to a group for which it
actually *is* on topic.

Paul Lalli
 
Chris F.A. Johnson

I had a need for a script that operates on a text file, sorted by
lines, which gives the instance count for each unique pattern in the
file. Using my limited shell-scripting capabilities, I have come up
with something which works. However, I am keen to know how the real
"pros" will approach this problem.

uniq -c file
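
For anyone who wants the counts laid out the way the original script prints them, here is a minimal sketch built on uniq -c. The helper name count_runs and the sample data are made up for illustration, and it assumes equal lines are already adjacent, i.e. the file is sorted:

```shell
#!/bin/sh
# count_runs is a hypothetical helper name, not something from the thread.
# It assumes equal lines are already adjacent (i.e. the file is sorted).
count_runs() {
    # uniq -c emits "<padding><count> <line>"; awk moves the count to the
    # end to mimic the "line : count" layout of the original script.
    uniq -c "$1" | awk '{ n = $1; sub(/^ *[0-9]+ /, ""); print $0 " : " n }'
}

# Example run on throwaway sample data (contents are illustrative only).
sample=$(mktemp)
printf '%s\n' apple apple banana cherry cherry cherry > "$sample"
count_runs "$sample"
rm -f "$sample"
```

If the default "count line" layout is acceptable, plain uniq -c is all that is needed and the awk step can be dropped.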
 
Ala Qumsieh

Jim said:
#!/usr/local/bin/perl
#
use strict;
use warnings;
my $usage = "Usage: $0 <file-name>\n";
print ($usage) and exit unless @ARGV;
print +("Cannot open file $ARGV[0]\n$usage")
    and exit unless -f $ARGV[0];
my @lines = <>;
chomp(@lines);
my %count;
$count{$_}++ for @lines;
foreach my $key ( sort keys %count ) {
    printf "%s: %d\n", $key, $count{$key};
}


Perl golfers can no doubt reduce this further.

First attempt:

perl -ple '$x{$_}++}{$_=join$/,map"$_: $x{$_}",keys%x' file_name

slightly shorter:

perl -nle '$x{$_}++}{print"$_: $x{$_}"for keys%x' file_name

--Ala
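
The one-liners above tally lines in a hash, so they do not depend on the input being sorted. A rough shell counterpart, as a sketch only: sort first, then let uniq -c count the runs. The name count_unsorted is hypothetical, and forcing LC_ALL=C is an assumption made here purely to get a reproducible byte-wise order:

```shell
#!/bin/sh
# count_unsorted is a hypothetical helper name used only for illustration.
count_unsorted() {
    # sort makes equal lines adjacent so that uniq -c can count each run.
    # LC_ALL=C is an assumption made here to get byte-wise, reproducible order.
    LC_ALL=C sort "$1" | uniq -c
}

# Example run on throwaway sample data (contents are illustrative only).
sample=$(mktemp)
printf '%s\n' b a b c a b > "$sample"
count_unsorted "$sample"
rm -f "$sample"
```

Unlike the hash-based versions, the output here always comes out in sorted order, since that is what makes the lines adjacent in the first place.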
 
Anno Siegel

Jim Gibson said:
[...]

Here is a simple re-write in Perl:

#!/usr/local/bin/perl
#
use strict;
use warnings;
my $usage = "Usage: $0 <file-name>\n";
print ($usage) and exit unless @ARGV;
print +("Cannot open file $ARGV[0]\n$usage")
    and exit unless -f $ARGV[0];
my $prev='';
my $count = 0;
while(my $line = <>) {
    chomp($line);
    if( $prev eq $line ) {
        $count++;
    }else{
        print "$prev: $count\n" if $count;
        $count = 1;
    }
    $prev = $line;
}
print "$prev: $count\n" if $count;

Here is a slightly shorter version that doesn't require the file be
sorted:

#!/usr/local/bin/perl
#
use strict;
use warnings;
my $usage = "Usage: $0 <file-name>\n";
print ($usage) and exit unless @ARGV;
print +("Cannot open file $ARGV[0]\n$usage")
    and exit unless -f $ARGV[0];
my @lines = <>;
chomp(@lines);
my %count;
$count{$_}++ for @lines;
foreach my $key ( sort keys %count ) {
    printf "%s: %d\n", $key, $count{$key};
}


Perl golfers can no doubt reduce this further.

That prints the counts in sorted order, not in the order the lines appear
in the original.

chomp( my @lines = <>);
my %count;
printf "%s: %d\n", $_, $count{$_} for grep ! $count{ $_} ++, @lines;

Anno
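
The same idea, counting everything but reporting in first-appearance order, can be sketched from the shell with awk. This is only an illustration: the name count_in_order is hypothetical, and the "line: count" layout is borrowed from the Perl versions above:

```shell
#!/bin/sh
# count_in_order is a hypothetical helper name used only for illustration.
count_in_order() {
    awk '
        # Record each distinct line the first time it is seen.
        !($0 in count) { order[++n] = $0 }
        # Tally every occurrence, sorted input or not.
        { count[$0]++ }
        # Report in first-appearance order, using the "line: count" layout.
        END { for (i = 1; i <= n; i++) print order[i] ": " count[order[i]] }
    ' "$1"
}

# Example run on throwaway sample data (contents are illustrative only).
sample=$(mktemp)
printf '%s\n' b a b c a b > "$sample"
count_in_order "$sample"
rm -f "$sample"
```

The separate order array is there because awk gives no guarantee about the order in which a for-in loop walks an array.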
 
William James

Generic said:
I had a need for a script that operates on a text file, sorted by
lines, which gives the instance count for each unique pattern in the
file. Using my limited shell-scripting capabilities, I have come up
with something which works. However, I am keen to know how the real
"pros" will approach this problem. My source code follows.

[...]

Awk:

1 == NR { prev = $0 }
$0 != prev { print prev, count ; count = 0 ; prev = $0 }
{ count++ }
END { if (NR) print prev, count }

Ruby:

ary = [] ; hsh = Hash.new(0)
while s = gets
  ary |= [ s ]
  hsh[ s ] += 1
end
ary.each{|x| print x.chomp, " ", hsh[x], "\n" }
 
