dictionary of dictionaries

kettle · Dec 9, 2007

Hi,
I'm wondering what the best practice is for creating an extensible
dictionary-of-dictionaries in python?

In perl I would just do something like:

my %hash_of_hashes;
for(my $i=0;$i<10;$i++){
    for(my $j=0;$j<10;$j++){
        ${$hash_of_hashes{$i}}{$j} = int(rand(10));
    }
}

but it seems to be more hassle to replicate this in python. I've
found a couple of references around the web but they seem cumbersome.
I'd like something compact.
-joe

Marc 'BlackJack' Rintsch · Dec 9, 2007

Use `collections.defaultdict`:

from collections import defaultdict
from random import randint

data = defaultdict(dict)
for i in xrange(11):
    for j in xrange(11):
        data[i][j] = randint(0, 10)
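
For readers on Python 3, where `xrange` no longer exists, the same idea might look like the sketch below. It assumes Python 3, and the `seed` call is an addition made only so that repeated runs give the same numbers; it is not part of the original suggestion.

```python
from collections import defaultdict
from random import randint, seed

seed(0)  # assumption: fixed seed only so repeated runs match

# A missing outer key gets a fresh inner dict on first access,
# so no explicit "does data[i] exist yet?" check is needed.
data = defaultdict(dict)
for i in range(11):
    for j in range(11):
        data[i][j] = randint(0, 10)

print(len(data), len(data[0]))  # number of outer keys, size of one inner dict
```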

If the keys `i` and `j` are not "independent" you might use a "flat"
dictionary with a tuple of both as keys:

data = dict(((i, j), randint(0, 10)) for i in xrange(11) for j in xrange(11))
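
A rough Python 3 sketch of the flat, tuple-keyed layout follows. The names `value` and `row_3` are made up for illustration; the second one shows that pulling out one "row" takes a scan over all keys, which is the usual trade-off against nested dicts.

```python
from random import randint

# One flat dict keyed by (i, j) tuples instead of nested dicts.
data = {(i, j): randint(0, 10) for i in range(11) for j in range(11)}

# Looking up a single cell is a plain key lookup.
value = data[(3, 4)]

# Rebuilding one "row" means filtering every key.
row_3 = {j: v for (i, j), v in data.items() if i == 3}
```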

And just for completeness: The given data in the example can be stored in a
list of lists of course:

data = [[randint(0, 10) for dummy in xrange(11)] for dummy in xrange(11)]

Ciao,
Marc 'BlackJack' Rintsch

kettle · Dec 10, 2007

Thanks for the heads up. Indeed it's just as nice as perl. One more
question though: defaultdict seems to work only with python2.5+.
For python < 2.5 it seems I have to do something like:

#!/usr/bin/python
from random import randint

dict_dict = {}
for x in xrange(10):
    for y in xrange(10):
        r = randint(0,10)
        try:
            dict_dict[x][y] = r
        except:
            if x in dict_dict:
                dict_dict[x][y] = r
            else:
                dict_dict[x] = {}
                dict_dict[x][y] = r

What I really need is to auto-increment a value each time I hit the
same word again. In perl I'd just do something like:

my %my_hash;
while(<FILE>){
    chomp;
    @_ = split(/\s+/);
    grep{$my_hash{$_}++} @_;
}

and this generalizes transparently to a hash of hashes, or a hash of a
hash of hashes, etc. In python < 2.5 this seems to require something
like:

my_dict = {}
for line in file:
    words = line.split()
    for word in words:
        my_dict[word] = 1 + my_dict.get(word, 0)

which I guess I can generalize to a dict of dicts, but it seems it will
require more if/else statements to check whether or not the higher-level
keys exist. I guess the real answer is that I should just migrate to
python2.5...!
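
On Python 2.5 or later, the two-level counting case described above can be sketched without any if/else checks. This is only an illustration under assumptions: it is written for Python 3, and the `pairs` data and the name `counts` are hypothetical stand-ins for whatever the real input file would provide.

```python
from collections import defaultdict

# Hypothetical input: (category, word) pairs standing in for a parsed file.
pairs = [("fruit", "apple"), ("fruit", "apple"), ("fruit", "pear"), ("veg", "leek")]

# Outer default: a new defaultdict(int). Inner default: 0.
counts = defaultdict(lambda: defaultdict(int))
for category, word in pairs:
    counts[category][word] += 1

# Convert to plain dicts for display or comparison.
plain = {k: dict(v) for k, v in counts.items()}
print(plain)
```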

-joe

Peter Otten · Dec 10, 2007

You can clean that up a bit:

from random import randrange

dict_dict = {}
for x in xrange(10):
    dict_dict[x] = dict((y, randrange(11)) for y in xrange(10))

Well, there's also dict.setdefault()

>>> pairs = ["ab", "ab", "ac", "bc"]
>>> outer = {}
>>> for a, b in pairs:
...     inner = outer.setdefault(a, {})
...     inner[b] = inner.get(b, 0) + 1
...
>>> outer
{'a': {'c': 1, 'b': 2}, 'b': {'c': 1}}

and it's not hard to write your own defaultdict

>>> class Dict(dict):
...     def __getitem__(self, key):
...         return self.get(key, 0)
...
>>> d = Dict()
>>> for c in "abbbcdeafgh": d[c] += 1
...
>>> d
{'a': 2, 'c': 1, 'b': 3, 'e': 1, 'd': 1, 'g': 1, 'f': 1, 'h': 1}
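
A related variant, for what it's worth: since Python 2.5, dict subclasses can define `__missing__`, which the built-in lookup calls only when a key is absent, so `__getitem__` itself need not be overridden. The sketch below assumes Python 3, and `CountingDict` and `probe` are made-up names. Because it relies on 2.5+ behaviour, it does not help on older interpreters.

```python
class CountingDict(dict):
    """Hypothetical name: a dict that treats missing keys as 0."""

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is absent.
        # Returning without storing keeps pure lookups side-effect free.
        return 0


d = CountingDict()
for c in "abbbcdeafgh":
    d[c] += 1  # reads 0 via __missing__, then stores the new count

probe = d["z"]  # yields the default but does not add "z" to the dict
```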

Peter

kettle · Dec 11, 2007

Nice, thanks for all the tips! I knew there had to be some handier
python ways to do these things. My initial attempts were just what
occurred to me first given my still limited knowledge of the language
and its idioms. Thanks again! -joe

kettle · Dec 11, 2007

One last question. I've heard the 'Explicit vs. Implicit' argument,
but this seems to boil down to a question of typical use cases and
what most people 'expect' as default behavior. The Dict class above,
which overrides __getitem__ to return a default, seems more generally
useful than the real dict behavior. What is the reasoning behind NOT
using this as the default implementation for a dict in python?

Marc 'BlackJack' Rintsch · Dec 11, 2007

How's that more useful in the general case? Maybe if you come from a
language where some default value pops up if the key is not present, you
are used to writing code in a way that exploits this fact. But in the
general case!? I don't need `defaultdict` very often, but I do want to
know when a key is not present in a dictionary, because most of the time
that's a special condition or error that has to be handled or signaled up
the call chain.
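
The difference being described can be seen in a small sketch (Python 3 assumed; all variable names here are illustrative only). A plain dict raises `KeyError` for an absent key, while a `defaultdict` returns the default and also stores the key as a side effect of the lookup.

```python
from collections import defaultdict

plain = {"a": 1}
try:
    plain["b"]
    missing_detected = False
except KeyError:
    missing_detected = True  # the plain dict signals the absent key

auto = defaultdict(int, {"a": 1})
value = auto["b"]       # no error: the default factory supplies 0
inserted = "b" in auto  # and the lookup itself stored the key
```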

Ciao,
Marc 'BlackJack' Rintsch

Kay Schluehr · Dec 11, 2007

You might produce the behaviour of the hash_of_hashes type directly:

class dict_of_dicts(dict):
    def __getitem__(self, key):
        d = dict.get(self, key, {})
        self[key] = d
        return d

    def __setitem__(self, key, val):
        assert isinstance(val, dict), \
            "Value of type dict expected. %s found instead." % (type(val))
        dict.__setitem__(self, key, val)

>>> d = dict_of_dicts()
>>> d[0][1] = "A"
>>> d[1][1] = "B"
>>> d[1][2] = "C"
>>> d
{0: {1: 'A'}, 1: {1: 'B', 2: 'C'}}
>>> d[0] = 0 # expects values of type dict
Traceback
....
AssertionError: Value of type dict expected. <type 'int'> found instead.
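
For the "hash of a hash of hashes" case raised earlier in the thread, a recursive `defaultdict` is one possible way to get nesting of arbitrary depth. This is a sketch under assumptions, not the class above: it targets Python 3, and `tree` and `to_plain` are hypothetical helper names.

```python
from collections import defaultdict


def tree():
    """Hypothetical helper: a dict whose missing keys become nested trees."""
    return defaultdict(tree)


def to_plain(node):
    """Recursively convert nested defaultdicts to plain dicts."""
    if isinstance(node, defaultdict):
        return {k: to_plain(v) for k, v in node.items()}
    return node


d = tree()
d[0][1] = "A"
d[1][1] = "B"
d[1][2]["deep"] = "C"  # a third level appears on demand

result = to_plain(d)
```

Unlike `dict_of_dicts`, this version does no type checking on assignment, so `d[0] = 0` would be accepted silently.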
