Pythonic way for missing dict keys

  • Thread starter Bruno Desthuilliers

Bruno Desthuilliers

Alex Popescu wrote:
Hi all!

I am pretty sure this has been asked a couple of times, but I don't seem
to find it on the archives (Google seems to have a couple of problems
lately).

I am wondering what is the most pythonic way of dealing with missing
keys and default values.

According to my readings one can take the following approaches:

1/ check before (this has a specific name and acronym that I haven't
learnt yet by heart)

if not my_dict.has_key(key):
    my_obj = myobject()
    my_dict[key] = my_obj
else:
    my_obj = my_dict[key]

if key not in my_dict:
    my_obj = my_dict[key] = myobject()
else:
    my_obj = my_dict[key]

2/ try and react on error (this has also a specific name, but...)

try:
    my_obj = my_dict[key]
except KeyError:
    my_obj = myobject()
    my_dict[key] = my_obj

cf above for a shortcut...

3/ dict.get usage:

my_obj = my_dict.get(key, myobject())

Note that this last one won't have the same result, since it won't store
my_obj under my_dict[key]. You'd have to use dict.setdefault:

my_obj = my_dict.setdefault(key, myobject())

I am wondering which one is the most recommended way?

It depends on the context. Regarding 1 and 2: use 1 if you expect that
most of the time my_dict[key] will not be set, and 2 if you expect that
most of the time it will be set.

get usage seems
the clearest, but the only problem I see is that I think myobject() is
evaluated at call time,

myobject will be instantiated each time, yes.

and so if the initialization is expensive you
will probably see surprises.

No "surprise" here, but it can indeed be suboptimal if instantiating
myobject is costly.
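
A small sketch of the point about eager evaluation, written for Python 3. The MyObject class and its counter are hypothetical stand-ins for the thread's myobject(); they only exist to make the instantiations visible.

```python
# Hypothetical stand-in for the thread's myobject(); it counts how often it is built.
class MyObject:
    created = 0

    def __init__(self):
        MyObject.created += 1


my_dict = {"present": MyObject()}   # one instance stored up front
MyObject.created = 0                # reset the counter for the demonstration

# The default argument is evaluated before get()/setdefault() runs,
# so an instance is built even though the key already exists.
a = my_dict.get("present", MyObject())
b = my_dict.setdefault("present", MyObject())
eager_count = MyObject.created      # two throwaway instances

# Checking first only builds the object when the key is really missing.
MyObject.created = 0
if "present" not in my_dict:
    my_dict["present"] = MyObject()
lazy_count = MyObject.created       # nothing was built
```

Both calls return the object that was already stored; the extra instances are simply discarded.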
 
Alex Popescu

Hi all!

I am pretty sure this has been asked a couple of times, but I don't seem
to find it on the archives (Google seems to have a couple of problems
lately).

I am wondering what is the most pythonic way of dealing with missing
keys and default values.

According to my readings one can take the following approaches:

1/ check before (this has a specific name and acronym that I haven't
learnt yet by heart)

if not my_dict.has_key(key):
    my_obj = myobject()
    my_dict[key] = my_obj
else:
    my_obj = my_dict[key]

2/ try and react on error (this has also a specific name, but...)

try:
    my_obj = my_dict[key]
except KeyError:
    my_obj = myobject()
    my_dict[key] = my_obj

3/ dict.get usage:

my_obj = my_dict.get(key, myobject())

I am wondering which one is the most recommended way? get usage seems
the clearest, but the only problem I see is that I think myobject() is
evaluated at call time, and so if the initialization is expensive you
will probably see surprises.

thanks in advance,
../alex
 
Neil Cerutti

Hi all!

I am pretty sure this has been asked a couple of times, but I
don't seem to find it on the archives (Google seems to have a
couple of problems lately).

I am wondering what is the most pythonic way of dealing with missing
keys and default values.

According to my readings one can take the following approaches:

There's also the popular collections.defaultdict.

Usually, the get method of normal dicts is what I want. I use a
defaultdict only when the implicit addition to the dictionary of
defaulted elements is what I really want.
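
To illustrate the "implicit addition" mentioned above, here is a minimal Python 3 sketch. The key name and the int factory are arbitrary choices for the example.

```python
from collections import defaultdict

plain = {}
counts = defaultdict(int)

# get() returns the default but leaves the dict untouched.
x = plain.get("missing", 0)

# Merely reading a missing key on a defaultdict inserts it.
y = counts["missing"]

print(plain)          # {}
print(dict(counts))   # {'missing': 0}
```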
 
Alex Popescu

There's also the popular collections.defaultdict.

Usually, the get method of normal dicts is what I want. I use a
defaultdict only when the implicit addition to the dictionary of
defaulted elements is what I really want.

This looks like the closest to my needs, but in my case the default value
involves the creation of a custom object instance that is taking parameters
from the current execution context, so I am not very sure I can use it.

../alex
 
Jakub Stolarski

Versions 1 and 2 do a different thing than version 3. The latter doesn't
add the value to the dict.

As it was mentioned before, use:
1 - if you expect that the key is usually not in the dict
2 - if you expect that the key is usually in the dict
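
The difference can be checked with a short Python 3 sketch. The three helper functions are hypothetical wrappers around the versions from the original post, with list standing in for myobject.

```python
def check_first(d, key, factory):
    # Version 1: look before you leap.
    if key not in d:
        d[key] = factory()
    return d[key]


def try_except(d, key, factory):
    # Version 2: try the lookup and react to the error.
    try:
        return d[key]
    except KeyError:
        d[key] = factory()
        return d[key]


def get_only(d, key, factory):
    # Version 3: dict.get returns a default but never stores it.
    return d.get(key, factory())


d1, d2, d3 = {}, {}, {}
r1 = check_first(d1, "k", list)
r2 = try_except(d2, "k", list)
r3 = get_only(d3, "k", list)

# Mutating the returned object shows whether it is the stored one.
r1.append(1)
r3.append(1)
print(d1, d2, d3)   # {'k': [1]} {'k': []} {}
```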
 
Alex Popescu

Versions 1 and 2 do a different thing than version 3. The latter doesn't
add the value to the dict.

As it was mentioned before, use:
1 - if you expect that the key is usually not in the dict
2 - if you expect that the key is usually in the dict

I may be missing something but I think the 3 approaches are completely
equivalent in terms of functionality.

../alex
 
Carsten Haese

This looks like the closest to my needs, but in my case the default value
involves the creation of a custom object instance that is taking parameters
from the current execution context, so I am not very sure I can use it.

If by "current execution context" you mean globally visible names, this
should still be possible:
>>> from collections import defaultdict
>>> def make_default():
...     return x+y
...
>>> dd = defaultdict(make_default)
>>> x = 40
>>> y = 2
>>> print dd[0]
42
>>> x = "Dead"
>>> y = " Parrot"
>>> print dd[1]
Dead Parrot
>>> print dd
defaultdict(<function make_default at 0xb7f71e2c>, {0: 42, 1: 'Dead Parrot'})

HTH,
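
If module-level globals feel too loose, one possible variant (a sketch only, in Python 3) is to let the factory close over an explicit context object. The make_cache function and the ctx mapping are hypothetical names introduced for this example.

```python
from collections import defaultdict


def make_cache(context):
    # The factory still takes no arguments; it reads the (hypothetical)
    # context mapping at the moment a missing key is looked up.
    return defaultdict(lambda: context["x"] + context["y"])


ctx = {"x": 40, "y": 2}
dd = make_cache(ctx)
first = dd[0]

ctx["x"], ctx["y"] = "Dead", " Parrot"
second = dd[1]

print(first, second)   # 42 Dead Parrot
```

The default is still computed from whatever the context holds at lookup time, so the caller has to keep it up to date, just as with the global names.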
 
Steven D'Aprano

I am wondering what is the most pythonic way of dealing with missing
keys and default values.

[snip three versions]

Others have already mentioned the collections.defaultdict type, however it
seems people have forgotten about the setdefault method of dictionaries.

value = somedict.setdefault(key, defaultvalue)

The disadvantage of setdefault is that the defaultvalue has to be created
up front. The disadvantage of collections.defaultdict is that the "default
factory" function takes no arguments, which makes it rather less than
convenient. One can work around this using global variables:

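>>> from collections import defaultdict
>>> # The default value is expensive to calculate, and known
... # only at runtime.
>>> expensivefunction = lambda x: str(x)
>>> D = defaultdict(lambda: expensivefunction(context))
>>> # lots of code goes here...
>>> # set context just before fetching from the default dict
>>> context = 42
>>> value = D['key']
>>> print value, D
42 defaultdict(<function <lambda> at 0xb7eb4fb4>, {'key': '42'})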

but one should be very leery of relying on global variables like that.

That suggests the best solution is something like this:

def getdefault(adict, key, expensivefunction, context):
    if key in adict:
        return adict[key]
    else:
        value = expensivefunction(context)
        adict[key] = value
        return value
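
A quick usage sketch in Python 3, assuming a stand-in expensive function that records its calls (the names expensive, calls and cache are made up for the example). The helper is repeated so the snippet runs on its own.

```python
def getdefault(adict, key, expensivefunction, context):
    # Same logic as the helper above.
    if key in adict:
        return adict[key]
    value = expensivefunction(context)
    adict[key] = value
    return value


calls = []


def expensive(context):
    # Hypothetical costly computation; records each invocation.
    calls.append(context)
    return str(context)


cache = {}
first = getdefault(cache, "key", expensive, 42)
second = getdefault(cache, "key", expensive, 99)   # key is cached, context ignored

print(first, second, calls)   # 42 42 [42]
```

The expensive function runs only once, and only when the key is actually missing.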
 
Carsten Haese

Can someone who knows about python internals throw some light on why
"x in dic" is cheaper than "dic.has_key(x)"?

I won't claim to know Python internals, but compiling and disassembling the
expressions in question reveals the reason:
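>>> import dis
>>> dis.dis(compile("dic.has_key(x)", "<string>", "eval"))
  1           0 LOAD_NAME                0 (dic)
              3 LOAD_ATTR                1 (has_key)
              6 LOAD_NAME                2 (x)
              9 CALL_FUNCTION            1
             12 RETURN_VALUE
>>> dis.dis(compile("x in dic", "<string>", "eval"))
  1           0 LOAD_NAME                0 (x)
              3 LOAD_NAME                1 (dic)
              6 COMPARE_OP               6 (in)
              9 RETURN_VALUE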

"dic.has_key(x)" goes through an attribute lookup to find the function that
looks for the key. "x in dic" finds the function more directly.
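
The same comparison can be reproduced on Python 3, as a rough sketch. dict.has_key no longer exists there, but the expression still compiles because compile() never runs it. Exact opcode names differ between interpreter versions, so the helper below (opnames and has_attr_lookup are names made up for this example) only checks whether an attribute or method lookup is present.

```python
import dis


def opnames(expr):
    """Return the opcode names produced by compiling expr as an expression."""
    code = compile(expr, "<string>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


def has_attr_lookup(ops):
    # Some 3.x releases emit LOAD_METHOD instead of LOAD_ATTR for method calls.
    return any(op in ("LOAD_ATTR", "LOAD_METHOD") for op in ops)


method_ops = opnames("dic.has_key(x)")
operator_ops = opnames("x in dic")

print(method_ops)
print(operator_ops)
print(has_attr_lookup(method_ops), has_attr_lookup(operator_ops))
```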
 
Alex Martelli

Carsten Haese said:
I won't claim to know Python internals, but compiling and disassembling the
expressions in question reveals the reason:

  1           0 LOAD_NAME                0 (dic)
              3 LOAD_ATTR                1 (has_key)
              6 LOAD_NAME                2 (x)
              9 CALL_FUNCTION            1
             12 RETURN_VALUE

  1           0 LOAD_NAME                0 (x)
              3 LOAD_NAME                1 (dic)
              6 COMPARE_OP               6 (in)
              9 RETURN_VALUE

"dic.has_key(x)" goes through an attribute lookup to find the function that
looks for the key. "x in dic" finds the function more directly.

Yup, it's mostly that, as microbenchmarking can confirm:

brain:~ alex$ python -mtimeit -s'd={}; f=d.has_key' 'f(23)'
10000000 loops, best of 3: 0.146 usec per loop
brain:~ alex$ python -mtimeit -s'd={}; f=d.has_key' '23 in d'
10000000 loops, best of 3: 0.142 usec per loop
brain:~ alex$ python -mtimeit -s'd={}; f=d.has_key' 'f(23)'
10000000 loops, best of 3: 0.146 usec per loop
brain:~ alex$ python -mtimeit -s'd={}; f=d.has_key' '23 in d'
10000000 loops, best of 3: 0.142 usec per loop
brain:~ alex$ python -mtimeit -s'd={}; f=d.has_key' 'd.has_key(23)'
1000000 loops, best of 3: 0.278 usec per loop
brain:~ alex$ python -mtimeit -s'd={}; f=d.has_key' 'd.has_key(23)'
1000000 loops, best of 3: 0.275 usec per loop

the in operator still appears to have a tiny repeatable advantage (about
4 nanoseconds on my laptop) wrt even the hoisted method, but the
non-hoisted method, due to repeated lookup, is almost twice as slow
(over 100 nanoseconds penalty, on my laptop).


Alex
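
For anyone who wants to repeat the measurement from a script rather than the shell, here is a sketch using the timeit module on Python 3. Since has_key is gone there, d.__contains__ is used as a stand-in for the method call; that substitution is an assumption of this example, and the absolute numbers will of course differ from the ones above.

```python
import timeit

LOOPS = 100000
setup = "d = {}; f = d.__contains__"

# Best of three runs for the operator, the hoisted bound method,
# and the method looked up again on every call.
timings = {
    stmt: min(timeit.repeat(stmt, setup=setup, number=LOOPS, repeat=3))
    for stmt in ("23 in d", "f(23)", "d.__contains__(23)")
}

for stmt, seconds in timings.items():
    print(f"{stmt:22} {seconds / LOOPS * 1e9:8.1f} ns per call")
```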
 
Duncan Booth

Rustom Mody said:
Can someone who knows about python internals throw some light on why
"x in dic" is cheaper than "dic.has_key(x)"?

Some special methods are optimised by having a reserved slot in the data
structure used to implement a class. The 'in' operator uses one of these
slots so it can bypass all the overheads of looking up an attribute such as
'has_key'.
 
Zentrader

From the 2.6 PEP #361 (looks like dict.has_key is deprecated)
Python 3.0 compatability: ['compatibility'-->someone should use a
spell-checker for 'official' releases]
- warnings were added for the following builtins which no
longer exist in 3.0:
apply, callable, coerce, dict.has_key, execfile, reduce,
reload
 
genro

myobject will be instantiated each time, yes.

No "surprise" here, but it can indeed be suboptimal if instantiating
myobject is costly.

What about this way ?

my_obj = my_dict.get(key) or my_dict.setdefault(key,myobject())

Ciao
G.
 
Marc 'BlackJack' Rintsch

What about this way ?

my_obj = my_dict.get(key) or my_dict.setdefault(key,myobject())

That limits the unnecessary instantiation of `myobject` to keys whose
stored value is a "false" object. It may not be good enough.

Ciao,
Marc 'BlackJack' Rintsch
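
A small Python 3 sketch of that caveat. The myobject factory and the built list are hypothetical and only there to count instantiations.

```python
built = []


def myobject():
    # Hypothetical factory; logs each instantiation.
    built.append(1)
    return object()


my_dict = {"zero": 0, "full": [1]}

# Truthy stored value: the right-hand side of "or" is never evaluated.
r_full = my_dict.get("full") or my_dict.setdefault("full", myobject())
after_truthy = len(built)

# Falsy stored value: myobject() is built although the key exists,
# and setdefault still returns the stored 0.
r_zero = my_dict.get("zero") or my_dict.setdefault("zero", myobject())
after_falsy = len(built)

print(after_truthy, after_falsy, r_zero)   # 0 1 0
```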
 
Alex Popescu

[snip...]

From the 2.6 PEP #361 (looks like dict.has_key is deprecated)
Python 3.0 compatability: ['compatibility'-->someone should use a
spell-checker for 'official' releases]
- warnings were added for the following builtins which no
longer exist in 3.0:
apply, callable, coerce, dict.has_key, execfile, reduce,
reload

I see... what that document doesn't describe is the alternatives to be
used. And I see in that list a couple of functions that are probably used a
lot nowadays (callable, reduce, etc.).

bests,
../alex
 
John J. Lee

Alex Popescu said:
[snip...]

From the 2.6 PEP #361 (looks like dict.has_key is deprecated)
Python 3.0 compatability: ['compatibility'-->someone should use a
spell-checker for 'official' releases]
- warnings were added for the following builtins which no
longer exist in 3.0:
apply, callable, coerce, dict.has_key, execfile, reduce,
reload

I see... what that document doesn't describe is the alternatives to be
used. And I see in that list a couple of functions that are probably used a
lot nowadays (callable, reduce, etc.).

callable and reduce are rarely used, at least in code I've seen. I
would agree there will be a large number of programs that contain one
or two calls to these functions, though. Certainly has_key will be
the most common of those listed above (but trivial to fix). apply
will be common in old code from the time of Python 1.5.2. execfile is
perhaps more common than callable (?) but again is really a "maybe 1
call in a big program" sort of thing. Anybody using coerce or reload
deserves to lose ;-)


John
 
Alex Popescu

Alex Popescu said:
On Jul 21, 7:48 am, Duncan Booth wrote:

[snip...]


From the 2.6 PEP #361 (looks like dict.has_key is deprecated)
Python 3.0 compatability: ['compatibility'-->someone should use a
spell-checker for 'official' releases]
- warnings were added for the following builtins which no
longer exist in 3.0:
apply, callable, coerce, dict.has_key, execfile,
reduce,
reload

I see... what that document doesn't describe is the alternatives to
be used. And I see in that list a couple of functions that are
probably used a lot nowadays (callable, reduce, etc.).

callable and reduce are rarely used, at least in code I've seen.

I thought G would be using that function a lot. Also, what is the
replacement for reduce? I think I remember seeing somewhere that list
comprehensions would be (but I also remember the advice that reduce
is quicker).

Certainly has_key will be
the most common of those listed above (but trivial to fix).

dict.has_key(key) becomes key in dict (correct?)

apply
will be common in old code from the time of Python 1.5.2.

I think there was some advice not to use apply.

execfile is
perhaps more common than callable (?) but again is really a "maybe 1
call in a big program" sort of thing.

What is the replacement for this one?

tia,
../alex
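
For reference, a sketch of how these questions turned out in Python 3: has_key became the in operator, apply became argument unpacking, reduce moved to functools, and execfile is usually replaced by reading the file and passing it to exec. callable was removed in 3.0 but came back in 3.2, and reload now lives in importlib. The add function and the temporary script below are made up for the example.

```python
import functools
import os
import tempfile

d = {"a": 1}
has_a = "a" in d                                  # was: d.has_key("a")


def add(x, y):
    return x + y


applied = add(*(1, 2))                            # was: apply(add, (1, 2))
total = functools.reduce(add, [1, 2, 3, 4], 0)    # was: reduce(add, [1, 2, 3, 4], 0)

# execfile(path) has no builtin successor; compile-and-exec is a common idiom.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
    tmp.write("answer = 6 * 7\n")
namespace = {}
with open(tmp.name) as source:
    exec(compile(source.read(), tmp.name, "exec"), namespace)
os.remove(tmp.name)

print(has_a, applied, total, namespace["answer"])   # True 3 10 42
```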
 
