collections.Counter surprisingly slow


Roy Smith

I've been doing an informal "intro to Python" lunchtime series for some
co-workers (who are all experienced programmers, in other languages).
This week I was going to cover list comprehensions, exceptions, and
profiling. So, I did a little demo showing different ways to build a
dictionary counting how many times a string appears in some input:

test() implements a "look before you leap" python loop

exception() uses a try/except construct in a similar python loop

default() uses a defaultdict

count() uses a Counter

I profiled it, to show how the profiler works. The full code is below.
The input is an 8.8 Mbyte file containing about 570,000 lines (11,000
unique strings). Python 2.7.3 on Ubuntu Precise.

As I expected, test() is slower than exception(), which is slower than
default(). I'm rather shocked to discover that count() is the slowest
of all! I expected it to be the fastest. Or, certainly, no slower
than default().

The full profiler dump is at the end of this message, but the gist of
it is:

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.322 0.322 ./stations.py:42(count)
1 0.159 0.159 0.159 0.159 ./stations.py:17(test)
1 0.114 0.114 0.114 0.114 ./stations.py:27(exception)
1 0.097 0.097 0.097 0.097 ./stations.py:36(default)

Why is count() [i.e. collections.Counter] so slow?

-------------------------------------------------------------
from collections import defaultdict, Counter

def main():
    lines = open('stations.txt').readlines()

    d1 = test(lines)
    d2 = exception(lines)
    d3 = default(lines)
    d4 = count(lines)

    print d1 == d2
    print d1 == d3
    print d1 == d4

def test(lines):
    d = {}
    for station in lines:
        if station in d:
            d[station] += 1
        else:
            d[station] = 1
    return d


def exception(lines):
    d = {}
    for station in lines:
        try:
            d[station] += 1
        except KeyError:
            d[station] = 1
    return d

def default(lines):
    d = defaultdict(int)
    for station in lines:
        d[station] += 1
    return d

def count(lines):
    d = Counter(lines)
    return d


if __name__ == '__main__':
    import cProfile
    import pstats
    cProfile.run('main()', 'stations.stats')
    p = pstats.Stats('stations.stats')
    p.sort_stats('cumulative').print_stats()
-------------------------------------------------------------

570335 function calls (570332 primitive calls) in 0.776 seconds

Ordered by: cumulative time

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.023 0.023 0.776 0.776 <string>:1(<module>)
1 0.006 0.006 0.753 0.753 ./stations.py:5(main)
1 0.000 0.000 0.322 0.322 ./stations.py:42(count)
1 0.000 0.000 0.322 0.322 /usr/lib/python2.7/collections.py:407(__init__)
1 0.242 0.242 0.322 0.322 /usr/lib/python2.7/collections.py:470(update)
1 0.159 0.159 0.159 0.159 ./stations.py:17(test)
1 0.114 0.114 0.114 0.114 ./stations.py:27(exception)
1 0.097 0.097 0.097 0.097 ./stations.py:36(default)
570285 0.080 0.000 0.080 0.000 {method 'get' of 'dict' objects}
1 0.055 0.055 0.055 0.055 {method 'readlines' of 'file' objects}
1 0.000 0.000 0.000 0.000 {isinstance}
1 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/abc.py:128(__instancecheck__)
2/1 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/abc.py:148(__subclasscheck__)
3/1 0.000 0.000 0.000 0.000 {issubclass}
4 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:58(__iter__)
1 0.000 0.000 0.000 0.000 {open}
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:26(__exit__)
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:20(__enter__)
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:36(__init__)
3 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:68(__contains__)
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:52(_commit_removals)
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:81(add)
3 0.000 0.000 0.000 0.000 {getattr}
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_weakrefset.py:16(__init__)
2 0.000 0.000 0.000 0.000 {method 'remove' of 'set' objects}
2 0.000 0.000 0.000 0.000 {method '__subclasses__' of 'type' objects}
2 0.000 0.000 0.000 0.000 /home/roy/deploy/current/python/lib/python2.7/_abcoll.py:97(__subclasshook__)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
4 0.000 0.000 0.000 0.000 {method 'add' of 'set' objects}
 

Steven D'Aprano

On Sun, 28 Jul 2013 15:59:04 -0400, Roy Smith wrote:

[...]
I'm rather shocked to discover that count() is the slowest
of all! I expected it to be the fastest. Or, certainly, no slower than
default().

The full profiler dump is at the end of this message, but the gist of it
is:

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.322 0.322 ./stations.py:42(count)
1 0.159 0.159 0.159 0.159 ./stations.py:17(test)
1 0.114 0.114 0.114 0.114 ./stations.py:27(exception)
1 0.097 0.097 0.097 0.097 ./stations.py:36(default)

Why is count() [i.e. collections.Counter] so slow?

It's within a factor of 2 of test, and 3 of exception or default (give or
take). I don't think that's surprisingly slow. In 2.7, Counter is written
in Python, while defaultdict has an accelerated C version. I expect that
has something to do with it.

Calling Counter ends up calling essentially this code:

for elem in iterable:
    self[elem] = self.get(elem, 0) + 1

(although micro-optimized), where "iterable" is your data (lines).
Calling the get method has higher overhead than dict[key]; that will also
contribute.
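Steven's description is easy to verify; here is a minimal sketch (the helper name is mine) of that pure-Python update loop, checked against Counter itself:

```python
from collections import Counter

def count_like_counter(iterable):
    # Essentially what the pure-Python Counter.update does in 2.7:
    # one bound-method call to dict.get per element plus one item
    # assignment, rather than a plain d[key] lookup.
    d = {}
    for elem in iterable:
        d[elem] = d.get(elem, 0) + 1
    return d

data = ['a', 'b', 'a', 'c', 'a', 'b']
print(count_like_counter(data) == dict(Counter(data)))  # True
```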
 

Roy Smith

Why is count() [i.e. collections.Counter] so slow?

It's within a factor of 2 of test, and 3 of exception or default (give or
take). I don't think that's surprisingly slow.

It is for a module which describes itself as "High-performance container
datatypes" :)
 

Serhiy Storchaka

28.07.13 22:59, Roy Smith wrote:
The input is an 8.8 Mbyte file containing about 570,000 lines (11,000
unique strings).

Repeat your tests with totally unique lines.
The full profiler dump is at the end of this message, but the gist of
it is:

The profiler affects execution time. In particular, it slows down the
Counter implementation, which uses more function calls. For real-world
measurements, use a different approach.
Why is count() [i.e. collections.Counter] so slow?

Feel free to contribute a patch which fixes this "wart". Note that
Counter shouldn't be slowed down on mostly unique data.
 

Stefan Behnel

Steven D'Aprano, 28.07.2013 22:51:
Calling Counter ends up calling essentially this code:

for elem in iterable:
    self[elem] = self.get(elem, 0) + 1

(although micro-optimized), where "iterable" is your data (lines).
Calling the get method has higher overhead than dict[key]; that will also
contribute.

It comes with a C accelerator (at least in Py3.4dev), but it seems like
that stumbles a bit over its own feet. The accelerator function special
cases the (exact) dict type, but the Counter class is a subtype of dict and
thus takes the generic path, which makes it benefit a bit less than possible.

Look for _count_elements() in

http://hg.python.org/cpython/file/tip/Modules/_collectionsmodule.c

Nevertheless, even the generic C code path looks fast enough in general. I
think the problem is just that the OP used Python 2.7, which doesn't have
this accelerator function.
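The exact-type check Stefan describes can be seen from the Python side; a small illustration (the C fast path amounts to an exact-dict type check, which a dict subclass like Counter fails):

```python
from collections import Counter

c = Counter()
# The accelerator's fast path special-cases the exact dict type;
# Counter is only a subclass, so it falls through to the generic path.
print(type(c) is dict)      # False -- fails the exact-type check
print(isinstance(c, dict))  # True  -- but it is still a dict subtype
```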

Stefan
 

Joshua Landau

28.07.13 22:59, Roy Smith wrote:

The input is an 8.8 Mbyte file containing about 570,000 lines (11,000 unique strings).

Repeat your tests with totally unique lines.


Counter is about ½ the speed of defaultdict in that case (as opposed to ⅓).

The full profiler dump is at the end of this message, but the gist of it is:

The profiler affects execution time. In particular, it slows down the
Counter implementation, which uses more function calls. For real-world
measurements, use a different approach.


Doing some re-times, it seems that his originals for defaultdict, exception
and Counter were about right. I haven't timed the other.

Why is count() [i.e. collections.Counter] so slow?
Feel free to contribute a patch which fixes this "wart". Note that Counter
shouldn't be slowed down on mostly unique data.


I find it hard to agree that Counter should be optimised for the
unique-data case; surely it's used far more often when there's actually a
point to counting?

Also, couldn't Counter just extend from defaultdict?
 

Joshua Landau

Steven D'Aprano, 28.07.2013 22:51:
Calling Counter ends up calling essentially this code:

for elem in iterable:
    self[elem] = self.get(elem, 0) + 1

(although micro-optimized), where "iterable" is your data (lines).
Calling the get method has higher overhead than dict[key]; that will also
contribute.

It comes with a C accelerator (at least in Py3.4dev), but it seems like
that stumbles a bit over its own feet. The accelerator function special
cases the (exact) dict type, but the Counter class is a subtype of dict and
thus takes the generic path, which makes it benefit a bit less than
possible.

Look for _count_elements() in

http://hg.python.org/cpython/file/tip/Modules/_collectionsmodule.c

Nevertheless, even the generic C code path looks fast enough in general. I
think the problem is just that the OP used Python 2.7, which doesn't have
this accelerator function.

# _count_elements({}, items), _count_elements(dict_subclass(), items),
# Counter(items), defaultdict(int) loop with exception handling
# "items" is always 1m long with varying levels of repetition
... helper.timeit(1), helper_subclass.timeit(1), counter.timeit(1), default.timeit(1)
...
(0.18816172199876746, 0.4679023139997298, 0.9684444869999425, 0.33518486200046027)
(0.2936601179990248, 0.6056111739999324, 1.1316078849995392, 0.46283868699902087)
(0.35396358400066674, 0.685048443998312, 1.2120939880005608, 0.5497965239992482)
(0.5337620789996436, 0.8658702100001392, 1.4507492869997805, 0.7772859329998028)
(0.745282343999861, 1.1455801379997865, 2.116569702000561, 1.3293145009993168)

:(

I have the helper but Counter is still slow. Is it not getting used for
some reason? It's not even as fast as the helper on a direct subclass of
dict (no overridden methods).
 

Ian Kelly

Also, couldn't Counter just extend from defaultdict?

It could, but I expect the C helper function in 3.4 will be faster
since it doesn't even need to call __missing__ in the first place.
And the cost (both in terms of maintenance and run-time) of adding
defaultdict to the Counter MRO just to reuse one tiny __missing__
method seems questionable, especially after taking into account that
the constructor signature is incompatible.
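The __missing__ hooks Ian mentions also differ observably: Counter's returns 0 without touching the dict, while defaultdict's inserts the default. A quick check (the behaviour is the same on 2.7 and 3.x):

```python
from collections import Counter, defaultdict

c = Counter()
print(c['absent'])    # 0 -- Counter.__missing__ returns zero...
print('absent' in c)  # False -- ...without storing the key

d = defaultdict(int)
print(d['absent'])    # 0 -- defaultdict.__missing__ calls int()...
print('absent' in d)  # True -- ...and stores the result
```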
 

Serhiy Storchaka

29.07.13 20:19, Ian Kelly wrote:
It could, but I expect the C helper function in 3.4 will be faster
since it doesn't even need to call __missing__ in the first place.

I'm surprised, but the Counter constructor is faster with the import of
this accelerator commented out (at least for some data).
 

Stefan Behnel

Serhiy Storchaka, 29.07.2013 21:37:
29.07.13 20:19, Ian Kelly wrote:

I'm surprised, but the Counter constructor is faster with the import of
this accelerator commented out (at least for some data).

Read my post. The accelerator doesn't take the fast path for dicts as
Counter is only a subtype of dict, not exactly a dict. That means that it
raises and catches a KeyError exception for each new value that it finds,
and that is apparently more costly than the overhead of calling get().

So, my expectation is that it's faster for highly repetitive data and
slower for mostly unique data.

Maybe a "fast_dict_lookup" option for the accelerator that forces the fast
path would fix this. The Counter class, just like many (most?) other
subtypes of dict, definitely doesn't need the fallback behaviour.

Stefan
 

Stefan Behnel

Stefan Behnel, 30.07.2013 08:39:
Serhiy Storchaka, 29.07.2013 21:37:

Read my post. The accelerator doesn't take the fast path for dicts as
Counter is only a subtype of dict, not exactly a dict. That means that it
raises and catches a KeyError exception for each new value that it finds,
and that is apparently more costly than the overhead of calling get().

So, my expectation is that it's faster for highly repetitive data and
slower for mostly unique data.

Maybe a "fast_dict_lookup" option for the accelerator that forces the fast
path would fix this. The Counter class, just like many (most?) other
subtypes of dict, definitely doesn't need the fallback behaviour.

Or rather, drop the fallback path completely. It's not worth having code
duplication if it's not predictable up-front (before looking at the data)
whether it will help or not.

http://bugs.python.org/issue18594

Stefan
 

Serhiy Storchaka

29.07.13 14:49, Joshua Landau wrote:
I find it hard to agree that counter should be optimised for the
unique-data case, as surely it's much more oft used when there's a point
to counting?

Different methods are faster for different data. The LBYL approach is best
for the mostly unique data case, while the EAFP approach is best for the
mostly repeated data case. In general, the performance of a particular
method is a function of its performance in these two extreme cases. When
it is slow in one extreme case, it can also be slow in a number of
intermediate cases.
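Serhiy's two extremes map directly onto the test() and exception() functions from the original post; in miniature:

```python
def count_lbyl(items):
    # Look Before You Leap: cheap when most keys are new, since no
    # exceptions are ever raised or caught.
    d = {}
    for item in items:
        if item in d:
            d[item] += 1
        else:
            d[item] = 1
    return d

def count_eafp(items):
    # Easier to Ask Forgiveness than Permission: cheap when most keys
    # repeat, since a KeyError is raised only on first sight of a key.
    d = {}
    for item in items:
        try:
            d[item] += 1
        except KeyError:
            d[item] = 1
    return d

data = ['a', 'a', 'b', 'a']
print(count_lbyl(data) == count_eafp(data) == {'a': 3, 'b': 1})  # True
```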
Also, couldn't Counter just extend from defaultdict?

Unfortunately, that would only slow it down.
 
