Pre-PEP: Dictionary accumulator methods

Kay Schluehr

Matteo said:
Kay said:
Why do You set

d.defaultValue(0)
d.defaultValue(function=list)

but not

d.defaultValue(0)
d.defaultValue([])

?

I think that's because you have to instantiate a different object for
each different key. Otherwise, you would instantiate just one list as a
default value for *all* default values.

Or the default value will be copied, which is not very hard either, or
type(self._default)() will be called. This is all equivalent and it
does not matter (except for performance reasons) which way to go as
long as only one is selected.

[...]
By the way, to really work, I think that Duncan's proposal should create
new objects when you try to access them, and to me it seems a bit
counterintuitive.

If the dict has a fixed semantics by applying defaultValue() and it
returns defaults instead of exceptions whenever a key is missing i.e.
behavioural invariance the client of the dict has nothing to worry
about, hasn't he?

I don't really understand you. What should 'type' return? A callable
that returns a new default value? That's exactly what Duncan proposed
with the "function" keyword argument.

I suspect the proposal really makes sense only if the dict-values are
of the same type. Filling it with strings, custom objects and other
stuff and receiving 0 or [] or '' if a key is missing would be a
surprise - at least for me. Instantiating dict the way I proposed
indicates type-guards! This is the reason why I want to delay this
issue and discuss it in a broader context. But I'm also undecided.
Guido's Python-3000 musings are in danger of becoming vaporware. "Now is
better than never"... Therefore +0.

Regards Kay
 
Colin J. Williams

Paul said:
El Pitonero said:
What about no name at all for the scalar case:

a['hello'] += 1
a['bye'] -= 2


I like this despite the minor surprise that it works even when
a['hello'] is uninitialized.
+1
and if the value is a list:
a['hello'] = [1, 2, 3]
a['hello'] += [4]  # adding the brackets is a lot simpler than
                   # typing append or extend.

Any user is free to add his/her own subclass to handle defaults.

Colin W.
 
Roose

Another option with no storage overhead which goes part way to reducing
the awkwardness would be to provide a decorator class accessible through
dict. The decorator class would take a value or function to be used as
the default, but apart from __getitem__ would simply forward all other
methods through to the underlying dictionary.

I'm not sure I like the decorator -- I would never use that flexibility to
have more than one default. I can't come up with any reason to ever use
that.

I think it works best as a simple subclass:

import copy
from pprint import pprint

class DefaultDict(dict):

    def __init__(self, default, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.default = default

    def __getitem__(self, key):
        # copy the default so a mutable default is not shared between keys
        return self.setdefault(key, copy.copy(self.default))

d = DefaultDict(0)
for x in [1, 3, 1, 2, 2, 3, 3, 3, 3]:
    d[x] += 1
pprint(d)

d = DefaultDict([])
for i, x in enumerate([1, 3, 1, 2, 2, 3, 3, 3, 3]):
    d[x].append(i)
pprint(d)

Output:

{1: 2, 2: 2, 3: 5}
{1: [0, 2], 2: [3, 4], 3: [1, 5, 6, 7, 8]}
 
Matteo Dell'Amico

Kay said:
Or the default value will be copied, which is not very hard either, or
type(self._default)() will be called. This is all equivalent and it
does not matter (except for performance reasons) which way to go as
long as only one is selected.

I don't like it very much... it seems too implicit to be pythonic. Also,
it won't work with non-copyable objects, and type(42)() == 0, and getting
0 when the default is 42 looks very strange. I prefer the explicit "give
me a callable" approach.
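The three strategies under discussion can be compared side by side. This is only a minimal sketch: the variable and function names are made up for illustration and are not part of any proposal in the thread.

```python
import copy

default = 42

# Strategy 1: rebuild the default from its type. The stored value is lost.
by_type = type(default)()        # int() gives 0, not 42

# Strategy 2: copy the stored default. Fine for 42, but it fails for
# objects that cannot be copied.
by_copy = copy.copy(default)     # 42

# Strategy 3: ask an explicit callable for a fresh value on every miss.
def make_default():
    return 42

by_factory = make_default()      # 42

# With a mutable default, handing out the same object aliases it ...
shared = []
a, b = shared, shared
a.append("x")                    # b sees the change as well

# ... whereas a factory hands out independent objects.
fresh_a, fresh_b = list(), list()
fresh_a.append("x")              # fresh_b stays empty
```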
If the dict has a fixed semantics by applying defaultValue() and it
returns defaults instead of exceptions whenever a key is missing i.e.
behavioural invariance the client of the dict has nothing to worry
about, hasn't he?

For idioms like d[foo].append('blah') to work properly, you'd have to
set the default value every time you access a variable. It can be really
strange to fill up memory only by apparently accessing values.
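A small sketch of that memory effect, assuming a hypothetical `AutoDict` whose `__getitem__` stores a fresh list on every miss (the class name is invented here):

```python
class AutoDict(dict):
    """Hypothetical dict that stores a new list whenever a missing key is read."""

    def __getitem__(self, key):
        # setdefault stores the new list, so even a plain read inserts the key
        return self.setdefault(key, [])


d = AutoDict()
d["seen"].append("blah")    # the list has to be stored, or "blah" would be lost

for probe in range(1000):   # looks like read-only access ...
    _ = d[probe]

entries = len(d)            # ... but every probe left an empty list behind
```

After the loop the dictionary holds one entry per probed key in addition to `"seen"`, even though the code never assigned to it explicitly.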
I suspect the proposal really makes sense only if the dict-values are
of the same type. Filling it with strings, custom objects and other
stuff and receiving 0 or [] or '' if a key is missing would be a
surprise - at least for me. Instantiating dict the way I proposed
indicates type-guards! This is the reason why I want to delay this
issue and discuss it in a broader context. But I'm also undecided.
Guido's Python-3000 musings are in danger of becoming vaporware. "Now is
better than never"... Therefore +0.

Having duck-typing, we can have things that have common interface but no
common type. For instance, iterables. I can imagine a list of iterables
of different types, and a default value of maybe [] or set([]).
 
Alexander Schmolck

Beni Cherniavsky said:
You mean giving a dictionary a default value at creation time, right?

Yes. But creating a defaultdict type with aliased content to the original
dict would also be fine by me.
Such a dictionary could be used very easily, as in <gasp>Perl::

foreach $word ( @words ) {
    $d{$word}++; # default of 0 assumed, simple code!
}

</gasp>. You would like to write::

d = dict.withdefault(0) # or something
for word in words:
    d[word] += 1 # again, simple code!

I agree that it's a good idea but I'm not sure the default should be specified
at creation time. The problem with that is that if you pass such a dictionary
into an unsuspecting function, it will not behave like a normal dictionary.

Have you got a specific example in mind?

Code that needlessly relies on exceptions for "normal operation" is rather
perverse IMO and I find it hard to think of other examples.
Also, this will go awry if the default is a mutable object, like ``[]`` - you
must create a new one at every access (or introduce a rule that the object is
copied every time, which I dislike).

I think copying should be on by default for objects that are mutable (and
explicitly selectable via ``.withdefault(bar,copy=False)``).

Python of course doesn't have an interface to query whether something is
mutable or not (maybe something that'll eventually be fixed), but hashable
might be a first approximation. If that's too dodgy, most commonly the value
will be a builtin type anyway, so copy by default with "efficient
implementation" (i.e. doing nothing) for ints, tuples etc. ought to work fine
in practice.
And there are cases where in different points in the code operating on the
same dictionary you need different default values.

The main problem here is that the obvious .setdefault is already taken to
misnome something else. Which I guess strengthens the point for some kind of
proxy object.
So perhaps specifying the default at every point of use by creating a proxy is
cleaner::

d = {}
for word in words:
    d.withdefault(0)[word] += 1
Of course, you can always create the proxy once and still pass it into an
unsuspecting function when that is actually what you mean.

Yup (I'd presumably prefer that second option for the above code).
How should a dictionary with a default value behave (whether inherently or a
proxy)?

- ``d.__getattr__(key)`` never raises KeyError for missing keys - instead it
returns the default value and stores the value as `d.setdefault()` does.
This is needed to make code like::

d.withdefault([])[key].append(foo)

to work - there is no call of `d.__setattr__()`, so `d.__getattr__()` must
have stored it.

I'm confused -- are you referring to __getitem__/__setitem__? Even then I don't
get what you mean: __getitem__ obviously works differently, but that would be
the whole point.
- `d.__setattr__()` and `d.__delattr__()` behave normally.

s/attr/item/ and agreed.
- Should ``key in d`` return True for all keys?

No. See below.
It is desirable to have *some* way to know whether a key is really
present. But if it returns False for missing keys, code that checks ``key
in d`` will behave differently from normally equivalent code that uses
try..except. If we use the proxy interface, we can always check on the
original dictionary object, which removes the problem.

- ``d.has_key(key)`` must do whatever we decide ``key in d`` does.

- What should ``d.get(key, [default])`` and ``d.setdefault(key, default)``
do? There is a conflict between the default of `d` and the explicitly given
default. I think consistency is better and these should pretend that `key`
is always present. But either way, there is a subtle problem here.

.setdefault ought to trump defaultdict's default. I feel that code that
operated without raising a KeyError on normal dicts should also operate the
same way on defaultdicts where possible. I'd also suspect that if you're
effectively desiring to override .setdefault's default you're up to something
dodgy.
- Of course `iter(d)`, `d.items()` and the like should only see the keys
that are really present (the alternative inventing an infinite amount of
items out of the blue is clearly bogus).

If the idea that the default should be specified in every operation (creating
a proxy) is accepted, there is a simpler and more fool-proof solution: the
proxy will not support anything except `__getitem__()` and `__setitem__()` at
all. Use the original dictionary for everything else. This prevents subtle
ambiguities.

Yes, that sounds like a fine solution to me -- if something goes wrong one is
at least likely to get an error immediately.
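A minimal sketch of such a restricted proxy follows. The class name `DefaultProxy` and the `copy_default` flag are hypothetical; `dict` has no `withdefault` method, so the proxy is constructed directly here.

```python
import copy


class DefaultProxy:
    """Hypothetical proxy: item get/set only, everything else stays on the dict."""

    # Switch off the fallback protocols that would otherwise route
    # iteration and `in` through __getitem__.
    __iter__ = None
    __contains__ = None

    def __init__(self, target, default, copy_default=True):
        self._target = target
        self._default = default
        self._copy = copy_default

    def __getitem__(self, key):
        try:
            return self._target[key]
        except KeyError:
            # store the default so that in-place mutation of it is kept
            value = copy.copy(self._default) if self._copy else self._default
            self._target[key] = value
            return value

    def __setitem__(self, key, value):
        self._target[key] = value


counts = {}
proxy = DefaultProxy(counts, 0)
for word in "a b a c a".split():
    proxy[word] += 1

groups = {}
DefaultProxy(groups, [])["x"].append(1)
```

Membership tests, `len()` and iteration are done on `counts` and `groups` themselves, so there is no ambiguity about which keys are really present.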

However the name .withdefault is then possibly a bit misleading -- but
.proxywithdefault is maybe a bit too long...

BTW, this scheme could also be extended to other collection types (lists and
sets, e.g.). e.g. ``l = []; l.proxywithdefault(0)[2] = 1;l `` => ``[0,0,1]``.

Whilst I think such behavior is asking for trouble if it's enabled by default
(as in Perl and Ruby, IIRC) and also lacks flexibility (as you can't specify
the fill value), occasionally it would be quite handy and I see little harm in
providing it when it's explicitly asked for.
Too specialized IMHO. You want a dictionary with any default anyway. If you
have that, what will be the benefit of a bag type?

I more thought of the bag type as an alternative to having a dictionary with
default value (the counting case occurs most frequently and conceptually it is
well modelled by a bag).
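For the counting case, a bag can be sketched in a few lines. The `Bag` class below and its method names are assumptions made for illustration, not an API proposed in this thread.

```python
class Bag:
    """Hypothetical multiset: remembers how many times each item was added."""

    def __init__(self, items=()):
        self._counts = {}
        for item in items:
            self.add(item)

    def add(self, item, qty=1):
        self._counts[item] = self._counts.get(item, 0) + qty

    def count(self, item):
        # an item that was never added simply has a count of zero
        return self._counts.get(item, 0)

    def __len__(self):
        return sum(self._counts.values())


bag = Bag("abracadabra")
```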

And I don't feel that a bag type is too specialized (plausibly too
specialized for a builtin -- but not for something provided by a module). Just
because there is a natural tendency to shoehorn everything into the
bread-and-butter types of some language (dicts and lists in python's case),
doesn't mean one can't overdo it, because eventually one will end up with a
semantic mess.

Anyway my current preferences would be a proxy with default value and only
__getitem__ and __setitem__ methods -- as you suggested above, but possibly
also for other collection types than just dict.

'as
 
Francis Girard

Hi,

I really do not like it. So -1 for me. Your two methods are very specialized
whereas the dict type is very generic. Usually, when I see something like
this in code, I can smell that it's a patch to overcome some shortcomings of a
previous design, thereby sparing the cost of re-designing. Simply said,
that's bad programming.

After that patch to provide a solution for only two of the more common
use-cases, you are nonetheless stuck with the old solution for all the
other use-cases (what if the value type is another dictionary or some
user-made class?).

Here's an alternate solution which I think answers all of the problems you
mentioned while being generic.

=== BEGIN SNAP

def update_or_another_great_name(self, key, createFunc, updtFunc):
    try:
        self[key] = updtFunc(self[key])
        ## This is "slow" with Python "=" since the key has to be searched
        ## twice. But the new built-in method just has to update the value the
        ## first time the key is found. Therefore speed should be ok.
        return True
    except KeyError:
        self[key] = createFunc()
        return False

## Now your two specialized methods can be easily written as :

## A built-in should be provided for this (if not already proposed) :
def identical(val):
    return val

def count(self, key, qty=1):
    self.update_or_another_great_name(key, identical,
                                      partial(operator.add, qty))
## partial is coming from : http://www.python.org/peps/pep-0309.html
## Using only built-in function (assuming "identical") as arguments makes it
## ok for speed (I guess).

def appendlist(self, key, *values):
    self.update_or_another_great_name(key,
                                      partial(list, values),
                                      partial(ListType.extend, X = values))
## The first "partial" usage here is an abuse just to make sure that the
## list is not actually constructed before needed. It should work.
## The second usage is more uncertain as we need to bind the arguments from
## the right. Therefore I have to use the name of the parameter and I am not
## sure if there's one. As this list is very prolific, someone might have an
## idea on how to improve this.

=== END SNAP
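For comparison, here is one way the idea between the SNAP markers could be made to run on a current Python. It is a sketch under assumptions, not the code as posted: the class name `AccumDict` and the shortened method name are invented, the create functions are written as zero-argument callables, and the right-binding problem is sidestepped with a small closure instead of `partial`.

```python
import operator
from functools import partial


class AccumDict(dict):
    """Hypothetical dict subclass with a generic update-or-create method."""

    def update_or_create(self, key, create_func, update_func):
        """Return True if an existing value was updated, False if one was created."""
        try:
            self[key] = update_func(self[key])
            return True
        except KeyError:
            self[key] = create_func()
            return False

    def count(self, key, qty=1):
        # create: start at qty; update: add qty to the old value
        self.update_or_create(key, lambda: qty, partial(operator.add, qty))

    def appendlist(self, key, *values):
        def extend_in_place(existing):
            existing.extend(values)
            return existing

        # the new list is only built when the key is actually missing
        self.update_or_create(key, lambda: list(values), extend_in_place)


d = AccumDict()
d.count("a")
d.count("a", 2)
d.appendlist("k", 1, 2)
d.appendlist("k", 3)
```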

By using only built-in constructs, this should be fast enough. Otherwise,
optimizing these built-ins is a much cleaner and saner way of thinking than
cluttering the API with ad-hoc propositions.

Reviewing the problems you mention :
The readability issues with the existing constructs are:

* They are awkward to teach, create, read, and review.

The method update_or_another_great_name is easy to understand, I think. But it
might not always be easy to use it efficiently with built-ins. But this is
always the case. "Recipes" can be added to show how to efficiently use the
method.
* Their wording tends to hide the real meaning (accumulation).
Solved.

* The meaning of setdefault() 's method name is not self-evident.
Solved.


The performance issues with the existing constructs are:

* They translate into many opcodes which slows them considerably.

I really don't know what will be the outcome of the solution I propose. I
certainly do not know anything about how my Python code translates into
opcodes.
* The get() idiom requires two dictionary lookups of the same key.
Solved

* The setdefault() idiom instantiates a new, empty list prior to every call.
Solved

* That new list is often not needed and is immediately discarded.
Solved

* The setdefault() idiom requires an attribute lookup for extend/append.
Solved

* The setdefault() idiom makes two function calls.

Solved

And perhaps, what you say here is also true for your two special use-cases :
For other
uses, plain Python code suffices in terms of speed, clarity, and avoiding
unnecessary instantiation of empty containers:

if key not in d:
    d[key] = {subkey: value}
else:
    d[key][subkey] = value


Much better than adding special cases on a generic class. Special cases always
multiply and if we open the door ....

Regards,

Francis Girard


On Saturday 19 March 2005 at 02:24, Raymond Hettinger wrote:
I would like to get everyone's thoughts on two new dictionary methods:

def count(self, key, qty=1):
    try:
        self[key] += qty
    except KeyError:
        self[key] = qty

def appendlist(self, key, *values):
    try:
        self[key].extend(values)
    except KeyError:
        self[key] = list(values)

The rationale is to replace the awkward and slow existing idioms for
dictionary based accumulation:

d[key] = d.get(key, 0) + qty
d.setdefault(key, []).extend(values)

In simplest form, those two statements would now be coded more readably as:

d.count(key)
d.appendlist(key, value)

In their multi-value forms, they would now be coded as:

d.count(key, qty)
d.appendlist(key, *values)

The error messages returned by the new methods are the same as those
returned by the existing idioms.

The get() method would continue to exist because it is useful for
applications other than accumulation.

The setdefault() method would continue to exist but would likely not make
it into Py3.0.


PROBLEMS BEING SOLVED
---------------------

The readability issues with the existing constructs are:

* They are awkward to teach, create, read, and review.
* Their wording tends to hide the real meaning (accumulation).
* The meaning of setdefault() 's method name is not self-evident.

The performance issues with the existing constructs are:

* They translate into many opcodes which slows them considerably.
* The get() idiom requires two dictionary lookups of the same key.
* The setdefault() idiom instantiates a new, empty list prior to every
  call.
* That new list is often not needed and is immediately discarded.
* The setdefault() idiom requires an attribute lookup for extend/append.
* The setdefault() idiom makes two function calls.

The latter issues are evident from a disassembly:
dis(compile('d[key] = d.get(key, 0) + qty', '', 'exec'))

  1           0 LOAD_NAME                0 (d)
              3 LOAD_ATTR                1 (get)
              6 LOAD_NAME                2 (key)
              9 LOAD_CONST               0 (0)
             12 CALL_FUNCTION            2
             15 LOAD_NAME                3 (qty)
             18 BINARY_ADD
             19 LOAD_NAME                0 (d)
             22 LOAD_NAME                2 (key)
             25 STORE_SUBSCR
             26 LOAD_CONST               1 (None)
             29 RETURN_VALUE

dis(compile('d.setdefault(key, []).extend(values)', '', 'exec'))

  1           0 LOAD_NAME                0 (d)
              3 LOAD_ATTR                1 (setdefault)
              6 LOAD_NAME                2 (key)
              9 BUILD_LIST               0
             12 CALL_FUNCTION            2
             15 LOAD_ATTR                3 (extend)
             18 LOAD_NAME                4 (values)
             21 CALL_FUNCTION            1
             24 POP_TOP
             25 LOAD_CONST               0 (None)
             28 RETURN_VALUE

In contrast, the proposed methods use only a single attribute lookup and
function call, they use only one dictionary lookup, they use very few
opcodes, and they directly access the accumulation functions,
PyNumber_Add() or PyList_Append(). IOW, the performance improvement
matches the readability improvement.


ISSUES
------

The proposed names could possibly be improved (perhaps tally() is more
active and clear than count()).

The appendlist() method is not as versatile as setdefault() which can be
used with other object types (perhaps for creating dictionaries of
dictionaries). However, most uses I've seen are with lists. For other
uses, plain Python code suffices in terms of speed, clarity, and avoiding
unnecessary instantiation of empty containers:

if key not in d:
    d[key] = {subkey: value}
else:
    d[key][subkey] = value



Raymond Hettinger
 
Mike Rovner

Reinhold said:
I don't quite understand that. Which dict item are you extending? Don't
you need something like

dl[key].append("word")

Right. It was just a typo on my part. Thanks for fixing.

Mike
 
Aahz

I am surprised nobody suggested we put those two methods into a
separate module (say dictutils or even UserDict) as functions:

from dictutils import tally, listappend

tally(mydict, key)
listappend(mydict, key, value)

That seems like a reasonable compromise.
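As plain functions, the two helpers could look roughly like this. The `dictutils` module does not exist; the bodies below are an assumed implementation built on the existing `get()` and `setdefault()` idioms.

```python
def tally(mapping, key, qty=1):
    """Add qty to the count stored under key, starting from zero."""
    mapping[key] = mapping.get(key, 0) + qty


def listappend(mapping, key, *values):
    """Append values to the list stored under key, creating it if needed."""
    mapping.setdefault(key, []).extend(values)


mydict = {}
tally(mydict, "spam")
tally(mydict, "spam", 2)

groups = {}
listappend(groups, "ham", 1)
listappend(groups, "ham", 2, 3)
```

Because the functions take any mapping as their first argument, they work with ordinary dicts and with dict subclasses alike.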
--
Aahz ([email protected]) <*> http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death." --GvR
 
David Eppstein

[email protected] (Aahz) said:
That seems like a reasonable compromise.

The more messages I see on this thread, the more I think adding a
different new method for each commonly used kind of update is the wrong
solution.

We already have methods that work pretty well and, I think, read better
than the new methods:
mydict[key] += 1
mydict[key].append(value)
The problem is merely that they don't work when key is missing, so we
need to resort to setdefault circumlocutions instead. A better solution
seems to be the one I've seen suggested here several times, of changing
the dict's behavior so that the setdefault is automatic whenever trying
to access a missing key. If this would be in a separate module or
separate subclass of dict, so much the better.
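A separate subclass along those lines might be sketched as follows. The name `AutoDefault` is invented, and the sketch relies on the `__missing__` hook that `dict` subclasses support in later Python versions, which is an assumption relative to the Python discussed in this thread.

```python
class AutoDefault(dict):
    """Hypothetical dict subclass: a missing key is filled in on first access."""

    def __init__(self, factory, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.factory = factory

    def __missing__(self, key):
        # dict.__getitem__ calls this only when the key is absent
        value = self[key] = self.factory()
        return value


counts = AutoDefault(int)
counts["x"] += 1
counts["x"] += 1

lists = AutoDefault(list)
lists["y"].append("value")
```

Only subscript access triggers the default; `in` and `get()` still report missing keys as missing.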
 
Ron

[email protected] (Aahz) said:
That seems like a reasonable compromise.

The more messages I see on this thread, the more I think adding a
different new method for each commonly used kind of update is the wrong
solution.

We already have methods that work pretty well and, I think, read better
than the new methods:
mydict[key] += 1
mydict[key].append(value)
The problem is merely that they don't work when key is missing, so we
need to resort to setdefault circumlocutions instead. A better solution
seems to be the one I've seen suggested here several times, of changing
the dict's behavior so that the setdefault is automatic whenever trying
to access a missing key. If this would be in a separate module or
separate subclass of dict, so much the better.


I think that the setdefault behavior needs to be decided on a
per-application basis, because who's to say what default is best?

With a preset default mode, it then becomes possible to inadvertently
create default values that will cause problems without knowing it. So
then we have to remember to change the setdefault value to None or
null to avoid problems. Ouch!

Also, Python's normal behavior for retrieving objects that are not
defined is to give an error. So having dictionaries that automatically
default to a mode that doesn't behave that way is inconsistent with
the rest of the language.

Yet, I'm all for the creation of specialized containers in a standard
module! :) Then we can have string dicts, and int dicts, and card
dicts, account dicts, etc, as well as specialized lists. Call them
'smart containers'. But they should not be built into the base class.

Ron
 
haraldarminmassa

Raymond,

I am +1 for both suggestions, tally and appendlist.

Extended:
Also, in all of my code base, I've not run across a single opportunity to use
something like unionset(). This is surprising because I'm the set() author and
frequently use set based algorithms. Your example was a good one and I can
also imagine a graph represented as a dictionary of sets. Still, I don't mind
writing out the plain Python for this one if it only comes up once in a blue
moon.

I am more than sure you are right about this. But, please keep in mind
that you and we all have come very, very accustomed to using lists for
everything and the kitchen sink in Python.

Lists were there from the beginning of Python, and even before the
birth of Python; very powerful, well implemented and theoretically
well founded data structures - I heard there is a whole language based
on list processing. *pun intended*

Sets, on the other hand --- I know, in mathematics they have a deep,
long history. But in programming? Yeah, in SQL and ABAP/4 basically
you are doing set operations on every join. But it's rather uncommon to
call it a set.

With 2.3 Python grew a set module. And, in ONLY ONE revision it got
promoted to a builtin type - an honour only those who read c.l.p.d.
regularly can value correctly.

And sets are SO NATURAL for a lot of problems ... I never thought of
replacing my "list in dict" constructs with sets before, BUT ....
there are 90% of problem domains where order is not important, AND
fast membership testing is a unique selling point.

So please, for best impressions: let us have a look at our code where
we use the
dict.setdefault(key,[]).append() idiom, and where it could be replaced,
more effectively, with dict.setdefault(key,set()).add()

If it is less than 60%, forget it. If it is more....
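The two idioms can be put next to each other. The sample data is invented purely for illustration.

```python
pairs = [("py", "a.py"), ("py", "b.py"), ("py", "a.py"), ("c", "x.c")]

as_lists = {}
as_sets = {}
for key, value in pairs:
    as_lists.setdefault(key, []).append(value)   # keeps order and duplicates
    as_sets.setdefault(key, set()).add(value)    # drops duplicates, fast membership
```

With lists, `"a.py"` is stored twice under `"py"`; with sets it is stored once and the insertion order is not part of the result.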

Harald
 
AndrewN

Raymond said:
I would like to get everyone's thoughts on two new dictionary methods:

def count(self, key, qty=1):
    try:
        self[key] += qty
    except KeyError:
        self[key] = qty

def appendlist(self, key, *values):
    try:
        self[key].extend(values)
    except KeyError:
        self[key] = list(values)

-0.9

Not impressed, they feel too specific for being builtin dictionary
methods and give the impression of just trying to save a few lines here
and there. I don't feel the names convey the functionality of the
methods either.

I know there's the speed argument but I'd rather not have these on the
dict at all.

+0.1

I sort of feel a slight need for this. But where would you stop? What
if people decrement lots? what if next there's a need for division? How
would you determine how you add the item to the key if it already
exists? In a general way:

mydict.set(key, value=None, default=None, how=operator.setitem)

This feels slightly better as it's not tied down to what sort of item
you're setting. But:

I dunno, feels a bit verbose maybe.
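One possible reading of that general form, written as a free function. Everything here is an assumption: the name `set_with`, the argument order, and the choice of `operator.add` as the default combiner (the signature above names `operator.setitem`, whose intended use is not spelled out).

```python
import operator


def set_with(mapping, key, value, default, how=operator.add):
    """Combine value with the stored item (or with default if key is missing)."""
    mapping[key] = how(mapping.get(key, default), value)


stock = {}
set_with(stock, "apples", 5, default=0)                     # increment
set_with(stock, "apples", 2, default=0, how=operator.sub)   # decrement
set_with(stock, "ratio", 4, default=100, how=operator.truediv)

log = {}
set_with(log, "events", ["start"], default=[])              # list concatenation
set_with(log, "events", ["stop"], default=[])
```

The call sites show the verbosity concern: every use has to repeat the default and, for anything but addition, the combiner.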
The setdefault() method would continue to exist but would likely not make it
into Py3.0.

I agree that setdefault is a wart though.

And for dict.default = value:

(Quoth RON):

"""With a preset default mode, it then becomes possible to
inadvertently
create default values that will cause problems without knowing it. So
then we have to remember to change the setdefault value to None or
null to avoid problems. Ouch!"""

Agreed, -1 there then.
PROBLEMS BEING SOLVED
---------------------

The readability issues with the existing constructs are:

* They are awkward to teach, create, read, and review.
* Their wording tends to hide the real meaning (accumulation).
* The meaning of setdefault() 's method name is not self-evident.

I feel this only really applies for setdefault (which I wouldn't be
sorry to see the back of). And your examples:

d[key] = d.get(key, 0) + qty
d.setdefault(key, []).extend(values)

Would better be written in a long-handed fashion anyway, as per the
implementations that were suggested:

try:
    d[key] += qty
except KeyError:
    d[key] = qty

Yeah, yeah, I know, speed. But not like this. Sorry.
 
Michele Simionato

FWIW, here is my take on the defaultdict approach:

def defaultdict(defaultfactory, dictclass=dict):
    class defdict(dictclass):
        def __getitem__(self, key):
            try:
                return super(defdict, self).__getitem__(key)
            except KeyError:
                return self.setdefault(key, defaultfactory())
    return defdict

d = defaultdict(int)()
d["x"] += 1
d["x"] += 1
d["y"] += 1
print d

d = defaultdict(list)()
d["x"].append(1)
d["x"].append(2)
d["y"].append(1)
print d

Michele Simionato
 
George Sakkis

Michele Simionato said:
FWIW, here is my take on the defaultdict approach:

def defaultdict(defaultfactory, dictclass=dict):
    class defdict(dictclass):
        def __getitem__(self, key):
            try:
                return super(defdict, self).__getitem__(key)
            except KeyError:
                return self.setdefault(key, defaultfactory())
    return defdict

d = defaultdict(int)()
d["x"] += 1
d["x"] += 1
d["y"] += 1
print d

d = defaultdict(list)()
d["x"].append(1)
d["x"].append(2)
d["y"].append(1)
print d

Michele Simionato


Best solution so far. If it wasn't for the really bad decision to add the dict(**kwargs)
constructor, I'd love to see something like
d = dict(valType=int)
d["x"] += 1

George
 
Evan Simpson

Raymond said:
I would like to get everyone's thoughts on two new dictionary methods:

def count(self, key, qty=1):

def appendlist(self, key, *values):

-1.0

When I need these, I just use subtype recipes. They seem way too
special-purpose for the base dict type.

class Counter(dict):
    def __iadd__(self, other):
        if other in self:
            self[other] += 1
        else:
            self[other] = 1
        return self

c = Counter()
for item in items:
    c += item

class Collector(dict):
    def add(self, key, value):
        if key in self:
            self[key].append(value)
        else:
            self[key] = [value]

c = Collector()
for k,v in items:
    c.add(k, v)

Cheers,

Evan @ 4-am
 
Greg Ewing

Michele said:
def defaultdict(defaultfactory, dictclass=dict):
    class defdict(dictclass):
        def __getitem__(self, key):
            try:
                return super(defdict, self).__getitem__(key)
            except KeyError:
                return self.setdefault(key, defaultfactory())
    return defdict

That looks really nice!

I'd prefer a more elegant name than 'defaultdict', though.
How about 'table'?
 
Roose

I agree -- I find myself NEEDING sets more and more. I use them with this
idiom quite often. Once they become more widely available (i.e. Python 2.3
is installed everywhere), I will use them almost as much as lists I bet.

So any solution IMO needs to at least encompass sets. But preferably
something like the Dict with Default approach which encompasses all
possibilities.

Roose
 
Steve Holden

Greg said:
That looks really nice!

I'd prefer a more elegant name than 'defaultdict', though.
How about 'table'?
By obvious analogy with Icon (where the dictionary-like object was
created with the option of a default value) this gets my +1.

regards
Steve
 
bearophileHUGS

R.H.:
The setdefault() method would continue to exist but
would likely not make it into Py3.0.

I agree to remove the setdefault.

I like the new count method, but I don't like the appendlist method,
because I think it's too specialized.

I too use sets a lot; recently I've suggested to add a couple of set
methods to dicts (working on the keys): intersection() and
difference().
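For reference, set-style operations on dictionary keys can be sketched with the key views that Python 3 dictionaries provide; this is an assumption about a later Python, not something available when this was posted. The sample dictionaries are invented.

```python
old = {"a": 1, "b": 2, "c": 3}
new = {"b": 20, "c": 30, "d": 40}

common = old.keys() & new.keys()    # keys present in both dictionaries
removed = old.keys() - new.keys()   # keys only present in the old one
```

Both results are plain sets of keys; the values are not involved in the comparison.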

Bearophile
 
Christos TZOTZIOY Georgiou

I would like to get everyone's thoughts on two new dictionary methods:

def count(self, key, qty=1):
    try:
        self[key] += qty
    except KeyError:
        self[key] = qty

def appendlist(self, key, *values):
    try:
        self[key].extend(values)
    except KeyError:
        self[key] = list(values)

Both are useful and often needed, so I am +1 on adding such
functionality. However, I am -0 on adding methods to dict.

I believe BJörn Lindqvist suggested a subtype of dict instead, which
feels more right. I believe this is a type of 'bag' collection, and it
could go to the collections module.

The default argument 99% of the time is the same for all calls to
setdefault of a specific instance. So I would suggest that the default
argument should be an attribute of the bag instance, given at instance
creation. And since unbound methods are going to stay, we can use the
accumulator method as a default argument (ie int.__add__ or list.append)

Based on the above, I would suggest something like the following
implementation, awaiting criticism on names, algorithm or applicability :)

class bag(dict):
    def __init__(self, accumulator=int.__add__):
        self.accumulator = accumulator

        # refinement needed for the following
        self.accu_class = accumulator.__objclass__

        # if there was an exception, probably the accumulator
        # provided was not appropriate

    def accumulate(self, key, value):
        try:
            old_value = self[key]
        except KeyError:
            self[key] = old_value = self.accu_class()
        new_value = self.accumulator(old_value, value)

        # and this needs refinement
        if new_value is not None: # method of immutable object
            self[key] = new_value

This works ok for int.__add__ and list.append.

PS I wrote these more than 36 hours ago, and before having read the
so-far downloaded messages of the thread. I kept on reading and
obviously others thought the same too (default argument at
initialisation).

What the heck, Bengt at least could like the class method idea :)
 
