FAQ 6.15 How can I print out a word-frequency or line-frequency summary?

PerlFAQ Server · Feb 1, 2011

This is an excerpt from the latest version of perlfaq6.pod, which
comes with the standard Perl distribution. These postings aim to
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

6.15: How can I print out a word-frequency or line-frequency summary?

To do this, you have to parse out each word in the input stream. We'll
pretend that by word you mean a chunk of alphabetics, hyphens, or
apostrophes, rather than the non-whitespace chunk idea of a word given
in the previous question:

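    while (<>) {
        # a word here is a letter followed by one or more word
        # characters, apostrophes, or hyphens, so one-letter
        # words such as "a" are not counted
        while ( /(\b[^\W_\d][\w'-]+\b)/g ) {   # misses "`sheep'"
            $seen{$1}++;
        }
    }

    # each() returns the pairs in no particular order
    while ( ($word, $count) = each %seen ) {
        print "$count $word\n";
    }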
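
To see what the pattern above does and does not treat as a word, here is a minimal sketch that applies it to a fixed string instead of the input stream. The sub name count_words and the sample sentence are made up for illustration; only the pattern comes from the answer.

```perl
use strict;
use warnings;

# count_words is a hypothetical helper: it applies the FAQ's pattern
# to a string and returns a hash reference of word => count.
sub count_words {
    my ($text) = @_;
    my %seen;
    while ( $text =~ /(\b[^\W_\d][\w'-]+\b)/g ) {
        $seen{$1}++;
    }
    return \%seen;
}

my $seen = count_words("The well-known cat isn't a dog; the cat sat.");

# Sort the keys only to make the output order repeatable.
for my $word ( sort keys %$seen ) {
    print "$seen->{$word} $word\n";
}
```

In this sample, "cat" is counted twice, "well-known" and "isn't" each survive as a single word, "The" and "the" are counted separately because the hash keys are case-sensitive, and the one-letter word "a" is not counted at all. If you want case-insensitive counts, one option is to use lc($1) as the key.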

If you wanted to do the same thing for lines, you wouldn't need a
regular expression:

    while (<>) {
        $seen{$_}++;
    }

    while ( ($line, $count) = each %seen ) {
        print "$count $line";
    }
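
The loop above uses each line as a key with its newline still attached, so a final line that lacks a newline would be counted separately from otherwise identical lines. A possible variation, assuming you want those merged, is to chomp each line first. The sample data and the in-memory filehandle below are only there to make the sketch self-contained.

```perl
use strict;
use warnings;

# Sample input held in a string; the last line has no trailing newline.
my $input = "apple\nbanana\napple\nbanana\napple";

# Opening a reference to a scalar gives an in-memory filehandle.
open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

my %seen;
while ( my $line = <$fh> ) {
    chomp $line;    # so "apple\n" and the final "apple" share one key
    $seen{$line}++;
}
close $fh;

# Sort the keys only to make the output order repeatable.
for my $line ( sort keys %seen ) {
    print "$seen{$line} $line\n";
}
```

Because the newline was removed, the print statement has to add it back.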

If you want the output sorted, see perlfaq4: "How do I sort
a hash (optionally by value instead of key)?".
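
As a rough illustration of sorted output (one way to do it, not necessarily the approach perlfaq4 describes), the sketch below orders the keys by descending count and breaks ties alphabetically. The %seen contents are invented sample data standing in for a hash built by one of the loops above.

```perl
use strict;
use warnings;

# Hypothetical counts, standing in for a %seen hash built earlier.
my %seen = ( the => 3, cat => 2, sat => 1, dog => 2 );

# Highest count first; ties fall back to alphabetical order of the key.
my @by_count = sort { $seen{$b} <=> $seen{$a} or $a cmp $b } keys %seen;

for my $word (@by_count) {
    printf "%5d %s\n", $seen{$word}, $word;
}
```

With this sample data the words come out in the order the, cat, dog, sat.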

--------------------------------------------------------------------

The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much relevant information as possible in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.

If you'd like to help maintain the perlfaq, see the details in
perlfaq.pod.
