Pre-PEP: Dictionary accumulator methods

George Sakkis

+1 on this. The new suggested operations are meaningful for a subset
of all valid dicts, so they should not be part of the base dict API. If
any version of this is approved, it will clearly be an ...

It is bad OO design, George. I want to be a bit more
specific on this and provide an example:

Kay, the '+1' was for your post, not the pre-PEP; I fully agree with you in that it's bad design. I
just tried to play devil's advocate by citing an argument that might be used to back the addition of
the proposed accumulating methods.

Regards,
George
 
Ron

def count(self, key, qty=1):
    try:
        self[key] += qty
    except KeyError:
        self[key] = qty

def appendlist(self, key, *values):
    try:
        self[key].extend(values)
    except KeyError:
        self[key] = list(values)


Why is it better than this?

dict[key]+=n
dict[key]+=list

Ron
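
As a rough illustration of the difference being asked about (not code from the thread): a plain `d[key] += n` raises KeyError when the key is missing, which is the case the proposed methods handle. `AccumDict` is a hypothetical name, and its methods only mirror the pre-PEP sketch quoted above.

```python
class AccumDict(dict):
    """Hypothetical dict subclass mirroring the two proposed methods."""

    def count(self, key, qty=1):
        # Add qty to the running total, starting from qty if the key is new.
        try:
            self[key] += qty
        except KeyError:
            self[key] = qty

    def appendlist(self, key, *values):
        # Extend the list stored under key, creating it if the key is new.
        try:
            self[key].extend(values)
        except KeyError:
            self[key] = list(values)


plain = {}
try:
    plain["spam"] += 1  # no entry yet, so the lookup fails
    plain_failed = False
except KeyError:
    plain_failed = True

tally = AccumDict()
tally.count("spam")
tally.count("spam", 2)
tally.appendlist("eggs", 1, 2)
tally.appendlist("eggs", 3)
```

So the augmented assignment is only equivalent once every key is already present.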
 
George Sakkis

Michael Spencer said:
I could imagine a class: accumulator(mapping, default, incrementor) such that:
my_tally = accumulator({}, 0, operator.add) or
my_dict_of_lists = accumulator({}, [], list.append) or
my_dict_of_sets = accumulator({}, set(), set.add)

then: .accumulate(key, value) "does the right thing" in each case.

a bit cumbersome, because of having to specify the accumulation method, but
avoids adding methods to, or sub-classing dict

Michael

That's the equivalent of reduce() for mappings. Given the current trend of moving away from
traditional functional features (lambda, map, filter, etc.), I would guess it's not likely to become
mainstream.

George
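
One possible reading of the accumulator class described above, as a sketch only. The class name `Accumulator`, the copying of the default, and the rule for telling in-place incrementors from pure ones are all assumptions; the post does not specify them.

```python
import copy
import operator


class Accumulator(object):
    """Hypothetical wrapper: mapping + default + incrementor."""

    def __init__(self, mapping, default, incrementor):
        self.mapping = mapping
        self.default = default
        self.incrementor = incrementor

    def accumulate(self, key, value):
        if key in self.mapping:
            current = self.mapping[key]
        else:
            # Copy the default so a mutable default is not shared between keys.
            current = copy.copy(self.default)
        result = self.incrementor(current, value)
        # Assumption: in-place incrementors such as list.append return None,
        # while pure ones such as operator.add return the new value.
        self.mapping[key] = current if result is None else result


my_tally = Accumulator({}, 0, operator.add)
for word in ["a", "b", "a"]:
    my_tally.accumulate(word, 1)

my_dict_of_lists = Accumulator({}, [], list.append)
my_dict_of_lists.accumulate("x", 1)
my_dict_of_lists.accumulate("x", 2)
my_dict_of_lists.accumulate("y", 3)

my_dict_of_sets = Accumulator({}, set(), set.add)
my_dict_of_sets.accumulate("x", 1)
my_dict_of_sets.accumulate("x", 1)
```

The per-key fold in accumulate() is what makes this resemble reduce(): each key carries its own running result.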
 
Mike Rovner

Paul said:
If the compiler can do some type inference, it can optimize out those
multiple calls pretty straightforwardly.

It can be tipped like that:

di = dict(int)
di.setdefault(0)
di[key] += 1

dl = dict(list)
dl.setdefault([])
dl.append("word")
dl.extend(mylist)

But the point is that if a method is not found in dict, it is delegated to
the container type specified in the constructor.

It solves dict specialization without bloating dict class and is generic.

Mike
 
Paul Rubin

Mike Rovner said:
It can be tipped like that:

di = dict(int)
di.setdefault(0)
di[key] += 1 ....
But the point is that if a method is not found in dict, it is delegated to
the container type specified in the constructor.

It solves dict specialization without bloating dict class and is generic.

Hey, I like that. I'd let the default be an optional extra arg to the
constructor:

di = dict(int, default=0)
di[key] += 1

without the setdefault. I might even add optional type checking:

di = dict(int, default=0, typecheck=True)
di[key] = 'foo' # raises TypeError
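
A minimal sketch of how such a constructor could behave, written as a subclass because the built-in dict accepts no such arguments. `TypedDefaultDict` is a hypothetical name, and falling back to value_type() when no default is given is an assumption. It relies on the `__missing__` hook that dict subclasses support from Python 2.5 on.

```python
class TypedDefaultDict(dict):
    """Hypothetical stand-in for dict(int, default=0, typecheck=True)."""

    def __init__(self, value_type, default=None, typecheck=False):
        dict.__init__(self)
        self.value_type = value_type
        self.default = default
        self.typecheck = typecheck

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        # Assumption: fall back to value_type() when no default is given.
        value = self.value_type() if self.default is None else self.default
        dict.__setitem__(self, key, value)
        return value

    def __setitem__(self, key, value):
        if self.typecheck and not isinstance(value, self.value_type):
            raise TypeError("expected %s, got %s"
                            % (self.value_type.__name__, type(value).__name__))
        dict.__setitem__(self, key, value)


di = TypedDefaultDict(int, default=0, typecheck=True)
di["spam"] += 1
di["spam"] += 1

try:
    di["eggs"] = "foo"  # wrong value type
    rejected = False
except TypeError:
    rejected = True
```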
 
Duncan Booth

Raymond said:
The rationale is to replace the awkward and slow existing idioms for
dictionary based accumulation:

d[key] = d.get(key, 0) + qty
d.setdefault(key, []).extend(values)

How about the alternative approach of allowing the user to override the
action to be taken when accessing a non-existent key?

d.defaultValue(0)

and the accumulation becomes:

d[key] += 1

and:

d.defaultValue(function=list)

would allow a safe:

d[key].extend(values)
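
To make the idea concrete, here is a sketch of a dict subclass offering such a method. `DefaultValueDict` is a hypothetical name, the choice to store the default on first access is an assumption, and the `__missing__` hook it uses is available to dict subclasses from Python 2.5 on.

```python
class DefaultValueDict(dict):
    """Hypothetical dict with a defaultValue() method as proposed."""

    _factory = None  # no default configured until defaultValue() is called

    def defaultValue(self, value=None, function=None):
        # Accept either a fixed value or a zero-argument function.
        if function is not None:
            self._factory = function
        else:
            self._factory = lambda: value

    def __missing__(self, key):
        if self._factory is None:
            raise KeyError(key)
        # Store the new default so in-place changes to it are kept.
        value = self[key] = self._factory()
        return value


counts = DefaultValueDict()
counts.defaultValue(0)
for ch in "abca":
    counts[ch] += 1

groups = DefaultValueDict()
groups.defaultValue(function=list)
groups["even"].extend([2, 4])
groups["odd"].extend([1])
```

Without a call to defaultValue(), lookups of missing keys still raise KeyError, so ordinary dict behaviour is unchanged.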
 
Matteo Dell'Amico

Raymond said:
I would like to get everyone's thoughts on two new dictionary methods:

def count(self, key, qty=1):
    try:
        self[key] += qty
    except KeyError:
        self[key] = qty

def appendlist(self, key, *values):
    try:
        self[key].extend(values)
    except KeyError:
        self[key] = list(values)

They look like a special case to me. They don't solve the problem for
lists of sets or lists of deques for instance, not to mention other
possible user-defined containers.

defaultdicts look to me like a solution that is more elegant and solves
more problems. What is the problem with them?
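
For comparison, a short sketch using `collections.defaultdict` from the standard library of later Python versions. Whether that class matches what is meant by "defaultdicts" here is an assumption. It shows one mechanism covering counters, sets and deques alike.

```python
from collections import defaultdict, deque

# Counting: int() returns 0, so missing keys start at zero.
tally = defaultdict(int)
for word in ["spam", "eggs", "spam"]:
    tally[word] += 1

# Grouping into sets rather than lists.
by_length = defaultdict(set)
for word in ["spam", "eggs", "ham", "spam"]:
    by_length[len(word)].add(word)

# Any zero-argument callable works, here a bounded deque per key.
recent = defaultdict(lambda: deque(maxlen=2))
visits = [("ann", "/a"), ("ann", "/b"), ("ann", "/c"), ("bob", "/a")]
for user, page in visits:
    recent[user].append(page)
```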
 
Reinhold Birkenfeld

Mike said:
Paul said:
If the compiler can do some type inference, it can optimize out those
multiple calls pretty straightforwardly.

It can be tipped like that:

di = dict(int)
di.setdefault(0)
di[key] += 1

Interesting, but why do you need to give the int type to the constructor?

dl = dict(list)
dl.setdefault([])
dl.append("word")
dl.extend(mylist)

I don't quite understand that. Which dict item are you extending? Don't
you need something like

dl[key].append("word")

?

Anyway, using `setdefault' as the method name is quite confusing,
although yours is IMHO a much better behavior given the name ;)

So what about `setdefaultvalue'?

Reinhold
 
Reinhold Birkenfeld

John said:
No I'm not kidding -- people from some cultures have no difficulty at
all in mentally splitting up "words" like "setdefault" or the German
equivalent of "Danubesteamnavigationcompany'sdirector'swife"; others
from other cultures where agglutinisation is not quite so rife will
have extreme difficulty.

Okay - as I'm German I might be preoccupied on this matter <wink>

Reinhold
 
Reinhold Birkenfeld

George said:
+1 on this. The new suggested operations are meaningful for a subset of all valid dicts, so they
should not be part of the base dict API. If any version of this is approved, it will clearly be an
application of the "practicality beats purity" zen rule, and the justification for applying it in
this case instead of subclassing had better be pretty strong; so far I'm not convinced though.

So, would the `setdefaultvalue' approach be more consistent in your eyes?

Reinhold
 
Roose

How about the alternative approach of allowing the user to override the
action to be taken when accessing a non-existent key?

d.defaultValue(0)

I like this a lot. It makes it more clear from the code what is going on,
rather than having to figure out what the name appendlist, count, tally,
whatever, is supposed to mean. When you see the value you'll know.

It's more general, because you can support dictionaries and sets then as
well.

I think someone mentioned that it might be a problem to add another piece of
state to all dicts though. I don't know enough about the internals to say
anything about this.

setdefault gets around this by having you pass in the value every time, so
it doesn't have to store it. It's very similar, but somehow many times more
awkward.
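
For reference, the two existing idioms under discussion look like this in use. Nothing here is new API; the sample data is made up for illustration.

```python
words = ["spam", "eggs", "spam"]

# Idiom 1: get() with the default repeated at the call site.
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1

# Idiom 2: setdefault() is handed a new [] on every call, needed or not.
positions = {}
for index, word in enumerate(words):
    positions.setdefault(word, []).append(index)
```

In both cases the default travels with each call instead of being stored in the dict, which is the trade-off described above.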
 
Kay Schluehr

Duncan said:
Raymond said:
The rationale is to replace the awkward and slow existing idioms for
dictionary based accumulation:

d[key] = d.get(key, 0) + qty
d.setdefault(key, []).extend(values)

How about the alternative approach of allowing the user to override the
action to be taken when accessing a non-existent key?

d.defaultValue(0)

and the accumulation becomes:

d[key] += 1

and:

d.defaultValue(function=list)

would allow a safe:

d[key].extend(values)

+0

The best suggestion up to now. But I find this premature because it
addresses only a special aspect of typing issues which should be
discussed together with Guido's type guard proposals in a broader
context. Besides this, the suggestion, though it feels pythonic, is still
uneven.

Why do you set

d.defaultValue(0)
d.defaultValue(function=list)

but not

d.defaultValue(0)
d.defaultValue([])

?

And why not dict(type=int), dict(type=list) instead where default
values are instantiated during object creation? A consistent pythonic
handling of all types should be envisioned not some ad hoc solutions
that go deprecated two Python releases later.

Regards Kay
 
Max

Paul said:
It is sort of an uncommon word. As a US English speaker I'd say it
sounds a bit old-fashioned, except when used idiomatically ("let's
tally up the posts about accumulator messages") or in nonstandard
dialect ("Hey mister tally man, tally me banana" is a song about
working on plantations in Jamaica). It may be more common in UK
English. There's an expression "tally-ho!" which had something to do
with British fox hunts, but they don't have those any more.

Has anyone _not_ heard Jeff Probst say, "I'll go tally the votes"?!

:)

--Max
 
Jeremy Bowers

It is bad OO design, George. I want to be a bit more
specific on this and provide an example:

Having thought about this for a bit, I agree it is a powerful
counter-argument and in many other languages I'd consider this a total win.

But this is Python, and frankly, I've overridden dict more than once and
violated the Liskov substitution principle without thinking twice. (Once,
oh yes, but not twice.) Of course, all the code was under my control then.

I think the tipping point for me depends on how the interfaces in Python
are going to be implemented, which I haven't dug into. If the dict class
gets an interface definition, can I subclass from dict and "cancel" (or
some other term) the interface I inherited?

If so, then this might still be OK, although if interfaces aren't going to
confuse newbies enough, wait 'till we try to explain that their code is
blowing up because they forgot to "cancel" their interface, and they
can't *really* pass their subclass in to something expecting a dict
interface. If you *can't* cancel or downgrade the interface, then I'd say
this argument is still good; dict should be kept minimal and this should
go in a subclass.
 
Matteo Dell'Amico

Kay said:
Why do you set

d.defaultValue(0)
d.defaultValue(function=list)

but not

d.defaultValue(0)
d.defaultValue([])

?

I think that's because you have to instantiate a different object for
each different key. Otherwise, you would instantiate just one list as a
default value for *all* default values. In other words, given:

class DefDict(dict):
    def __init__(self, default):
        self.default = default
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            return self.default

you'll get

In [12]: d = DefDict([])

In [13]: d[42].extend(['foo'])

In [14]: d.default
Out[14]: ['foo']

In [15]: d[10].extend(['bar'])

In [16]: d.default
Out[16]: ['foo', 'bar']

In [17]: d[10]
Out[17]: ['foo', 'bar']

In [18]: d[10] is d.default
Out[18]: True

and this isn't what you really wanted.

By the way, to really work, I think that Duncan's proposal should create
new objects when you try to access them, and to me it seems a bit
counterintuitive. Nevertheless, I'm +0 on it.

Kay said:
And why not dict(type=int), dict(type=list) instead where default
values are instantiated during object creation? A consistent pythonic
handling of all types should be envisioned not some ad hoc solutions
that go deprecated two Python releases later.

I don't really understand you. What should 'type' return? A callable
that returns a new default value? That's exactly what Duncan proposed
with the "function" keyword argument.
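
The shared-list problem shown in the session above goes away when the default comes from a callable and is stored on first access. A minimal sketch, with `FactoryDefDict` as a hypothetical name:

```python
class FactoryDefDict(dict):
    """Hypothetical variant of DefDict that takes a callable, not a value."""

    def __init__(self, factory):
        dict.__init__(self)
        self.factory = factory

    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            # Build a fresh default for this key and remember it.
            value = self.factory()
            dict.__setitem__(self, item, value)
            return value


d = FactoryDefDict(list)
d[42].extend(["foo"])
d[10].extend(["bar"])
```

Each key now owns its own list, at the cost that merely reading a missing key inserts it.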
 
Magnus Lie Hetland

I would like to get everyone's thoughts on two new dictionary methods:

def count(self, key, qty=1):
    try:
        self[key] += qty
    except KeyError:
        self[key] = qty

Yes, yes, YES!

*Man*, this would be useful.

def appendlist(self, key, *values):
    try:
        self[key].extend(values)
    except KeyError:
        self[key] = list(values)

Woohoo! *Just* as useful.

I'd *definitely* use these.

Not 100% sure about the names, though. (add() and append() seem like
more natural names -- but they may be confusing, considering their
other uses...)

+1 on both (possibly allowing for some naming discussion...)
 
BJörn Lindqvist

I like count() and appendlist() or whatever they will be named. But I
have one question/idea:

Why do the methods have to be put in dict? Can't there be a subtype
of dict that includes those two methods? I.e.:

histogram = counting_dict()
for ch in text:
    histogram.count(ch)

Then maybe some more tailor-made methods can be added for these two
types of dicts:

for ch in string.ascii_letters:
    print("Frequency of %s = %d." % (ch, histogram.freq(ch)))

Or something, you get the idea.
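
A sketch of what such a subtype might look like. `counting_dict`, `count` and `freq` are the names used above; the behaviour of freq() (raw count, 0 for unseen keys) and the sample text are assumptions.

```python
import string


class counting_dict(dict):
    """Hypothetical subtype; count() follows the pre-PEP, freq() is a guess."""

    def count(self, key, qty=1):
        self[key] = self.get(key, 0) + qty

    def freq(self, key):
        # Assumption: report the raw count, 0 for keys never counted.
        return self.get(key, 0)


text = "hello world"
histogram = counting_dict()
for ch in text:
    histogram.count(ch)

for ch in string.ascii_letters:
    if histogram.freq(ch):
        print("Frequency of %s = %d." % (ch, histogram.freq(ch)))
```

Keeping the methods in a subtype leaves plain dict untouched, which is the point of the question.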
 
Duncan Booth

Roose said:
I think someone mentioned that it might be a problem to add another
piece of state to all dicts though. I don't know enough about the
internals to say anything about this.

setdefault gets around this by having you pass in the value every
time, so it doesn't have to store it. It's very similar, but somehow
many times more awkward.

Another option with no storage overhead which goes part way to reducing
the awkwardness would be to provide a decorator class accessible through
dict. The decorator class would take a value or function to be used as
the default, but apart from __getitem__ would simply forward all other
methods through to the underlying dictionary.

That gives you the ability to have not one default value for a
dictionary, but many different ones: you just decorate the dictionary
anywhere you need a default and use the underlying dictionary everywhere
else.

Some code which demonstrates the principle rather than the
implementation. dictDefaultValue could be imagined as
dict.defaultValue, dictDefaultValue(d, ...) could be
d.defaultValue(...) although the actual name used needs work:

# Sentinel meaning "argument not supplied" (assumed definition).
_marker = object()

class dictDefaultValue(object):
    def __init__(self, d, value=_marker, function=_marker):
        self.__d = d
        if value is _marker:
            if function is _marker:
                raise TypeError("expected either value or function argument")
            self.__dv = function
        else:
            # Wrap a fixed value so both cases are called the same way.
            def defaultValue():
                return value
            self.__dv = defaultValue

    def __getattr__(self, name):
        # Forward everything else to the underlying dictionary.
        return getattr(self.__d, name)

    def __getitem__(self, name):
        try:
            return self.__d[name]
        except KeyError:
            value = self.__dv()
            self.__d[name] = value
            return value

    def __setitem__(self, name, value):
        self.__d[name] = value

d = {}
accumulator = dictDefaultValue(d, 0)
accumulator['a'] += 1
aggregator = dictDefaultValue(d, function=list)
aggregator['b'].append('hello')
print(d)  # {'a': 1, 'b': ['hello']}
 
