Script that gives instance count of unique patterns in a sorted file

Generic Usenet Account · May 4, 2006

I had a need for a script that operates on a text file, sorted by
lines, which gives the instance count for each unique pattern in the
file. Using my limited shell-scripting capabilities, I have come up
with something which works. However, I am keen to know how the real
"pros" will approach this problem. My source code follows.

Thanks,
Song

/////////////////////////////////////////////////////////////////////////
#!/bin/ksh
#
# Counts number of repetitions of each line in a sorted file
#
#set -x

USAGE_STR="$0 <file-name>"

function countRepetitions
{
    if [ "$#" -ne 0 ]
    then
        prev=""
        count=0

        # IFS= and -r keep leading whitespace and backslashes intact
        while IFS= read -r current
        do
            # A new value ends the previous run, so report it
            if [ "$count" -gt 0 ] && [ "$current" != "$prev" ]
            then
                echo "$prev" : "$count"
                count=0
            fi
            prev=$current
            count=$((count + 1))
        done < "$1"

        # Report the last run, which the loop itself never prints
        if [ "$count" -gt 0 ]
        then
            echo "$prev" : "$count"
        fi
    fi
}

if [ "$#" -eq 0 ]
then
    echo "$USAGE_STR"
    exit 1
elif [ ! -f "$1" ]
then
    echo "Error: Cannot open file $1"
    echo "$USAGE_STR"
    exit 2
fi

countRepetitions "$1"
exit 0

Uri Guttman · May 4, 2006

that is some of the worst perl code i have ever seen. it won't even
compile. come back when your code compiles cleanly.

uri

Denver · May 4, 2006

Uri said:
that is some of the worst perl code i have ever seen.

He multiply cross-posted the message.
It is not perl at all.

Uri Guttman · May 4, 2006

MB> I don't think the OP was posting perl; it was a shell script.

Duh!!!

uri

Paul Lalli · May 4, 2006

Marc said:
I don't think the OP was posting perl; it was a shell script.

Your sarcasm meter is pretty out of whack. You might want to reboot
it.

Paul Lalli

Paul Lalli · May 4, 2006

Denver said:
He multiply cross-posted the message.
It is not perl at all.

No kidding.

Posting a shell script to a Perl newsgroup is not made acceptable by
simply cross-posting the same shell script to a group for which it
actually *is* on topic.

Paul Lalli

Chris F.A. Johnson · May 4, 2006

I had a need for a script that operates on a text file, sorted by
lines, which gives the instance count for each unique pattern in the
file. Using my limited shell-scripting capabilities, I have come up
with something which works. However, I am keen to know how the real
"pros" will approach this problem.

uniq -c file
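
A minimal sketch of how that one-liner could be wrapped to give the "line : count" layout of the original script. uniq -c prints the count first, padded with spaces, so a small awk step moves it to the end. The function name count_runs is made up for illustration, and the padding width differs between uniq implementations, so the count is stripped by pattern rather than by column.

```shell
#!/bin/sh
# count_runs is a hypothetical helper name, not something from the thread.
# Reads sorted lines on stdin and prints "line : count" for each distinct line.
count_runs() {
    uniq -c | awk '{
        n = $1                      # first field is the count from uniq -c
        sub(/^ *[0-9]+ /, "")       # drop the padding, the count and one space
        print $0 " : " n
    }'
}

# Input that is not yet sorted can be piped through sort first.
printf 'pear\napple\npear\n' | sort | count_runs
```

With a file, the same thing would be run as: count_runs < file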

Ala Qumsieh · May 4, 2006

Jim said:
#!/usr/local/bin/perl
#
use strict;
use warnings;
my $usage = "Usage: $0 <file-name>\n";
print ($usage) and exit unless @ARGV;
print +("Cannot open file $ARGV[0]\n$usage")
and exit unless -f $ARGV[0];
my @lines = <>;
chomp(@lines);
my %count;
$count{$_}++ for @lines;
foreach my $key ( sort keys %count ) {
    printf "%s: %d\n", $key, $count{$key};
}

Perl golfers can no doubt reduce this further.

First attempt:

perl -ple '$x{$_}++}{$_=join$/,map"$_: $x{$_}",keys%x' file_name

slightly shorter:

perl -nle '$x{$_}++}{print"$_: $x{$_}"for keys%x' file_name
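
The hash-counting idea behind these one-liners can also be sketched in shell with awk. This is only an illustration; the function name count_any_order is an assumption, not something from the thread. Like keys%x in Perl, awk's "for (k in array)" gives no particular order, so the output is piped through sort when a stable order is wanted.

```shell
#!/bin/sh
# count_any_order is a hypothetical name used only for this sketch.
# Counts every distinct line on stdin; the input need not be sorted.
count_any_order() {
    awk '
        { seen[$0]++ }              # one counter per distinct line
        END {
            for (line in seen)      # iteration order is unspecified in awk
                print line ": " seen[line]
        }
    '
}

printf 'b\na\nb\n' | count_any_order | sort
```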

--Ala

Anno Siegel · May 4, 2006

Jim Gibson said:
[...]

Here is a simple re-write in Perl:

#!/usr/local/bin/perl
#
use strict;
use warnings;
my $usage = "Usage: $0 <file-name>\n";
print ($usage) and exit unless @ARGV;
print +("Cannot open file $ARGV[0]\n$usage")
and exit unless -f $ARGV[0];
my $prev='';
my $count = 0;
while(my $line = <>) {
    chomp($line);
    if( $prev eq $line ) {
        $count++;
    }else{
        print "$prev: $count\n" if $count;
        $count = 1;
    }
    $prev = $line;
}
print "$prev: $count\n" if $count;

Here is a slightly shorter version that doesn't require the file be
sorted:

#!/usr/local/bin/perl
#
use strict;
use warnings;
my $usage = "Usage: $0 <file-name>\n";
print ($usage) and exit unless @ARGV;
print +("Cannot open file $ARGV[0]\n$usage")
and exit unless -f $ARGV[0];
my @lines = <>;
chomp(@lines);
my %count;
$count{$_}++ for @lines;
foreach my $key ( sort keys %count ) {
    printf "%s: %d\n", $key, $count{$key};
}

Perl golfers can no doubt reduce this further.

That prints the counts in sorted order, not in the order in which the
lines appear, as the original does.

chomp( my @lines = <>);
my %count;
printf "%s: %d\n", $_, $count{$_} for grep ! $count{ $_} ++, @lines;
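
For comparison, a rough shell sketch of the same first-appearance ordering, using awk. The name count_first_seen is hypothetical. Where the Perl above uses grep to pick out first occurrences, this version keeps an extra array recording the order in which new lines arrive.

```shell
#!/bin/sh
# count_first_seen is a hypothetical name used only for this sketch.
# Prints "line: count" in the order each distinct line first appears on stdin.
count_first_seen() {
    awk '
        !($0 in seen) { order[++n] = $0 }   # remember each new line once
        { seen[$0]++ }                      # count every occurrence
        END {
            for (i = 1; i <= n; i++)
                print order[i] ": " seen[order[i]]
        }
    '
}

printf 'pear\napple\npear\n' | count_first_seen
```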

Anno

William James · May 4, 2006

Generic said:
I had a need for a script that operates on a text file, sorted by
lines, which gives the instance count for each unique pattern in the
file. Using my limited shell-scripting capabilities, I have come up
with something which works. However, I am keen to know how the real
"pros" will approach this problem. My source code follows.

[...]

Awk:

1 == NR { prev = $0 }
$0 != prev { print prev, count ; count = 0 ; prev = $0 }
{ count++ }
END { if (NR) print prev, count }

Ruby:

ary = [] ; hsh = Hash.new(0)
while s = gets
  ary |= [ s ]
  hsh[ s ] += 1
end
ary.each{|x| print x.chomp, " ", hsh[x], "\n" }
