dictionary of dictionaries

kettle

Hi,
I'm wondering what the best practice is for creating an extensible
dictionary-of-dictionaries in python?

In perl I would just do something like:

    my %hash_of_hashes;
    for (my $i = 0; $i < 10; $i++) {
        for (my $j = 0; $j < 10; $j++) {
            ${$hash_of_hashes{$i}}{$j} = int(rand(10));
        }
    }

but it seems to be more hassle to replicate this in python. I've
found a couple of references around the web but they seem cumbersome.
I'd like something compact.
-joe
 
Marc 'BlackJack' Rintsch

Use `collections.defaultdict`:

    from collections import defaultdict
    from random import randint

    data = defaultdict(dict)
    for i in xrange(11):
        for j in xrange(11):
            data[i][j] = randint(0, 10)

If the keys `i` and `j` are not "independent" you might use a "flat"
dictionary with a tuple of both as keys:

data = dict(((i, j), randint(0, 10)) for i in xrange(11) for j in xrange(11))

And just for completeness: The given data in the example can be stored in a
list of lists of course:

data = [[randint(0, 10) for dummy in xrange(11)] for dummy in xrange(11)]
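
For readers on Python 3, where `xrange` no longer exists, the three layouts above might look like the following. This is only a minimal sketch: the names `N`, `nested`, `flat` and `grid` are mine and do not come from the post.

```python
from collections import defaultdict
from random import randint

N = 11  # same bounds as the snippets above

# 1. dict of dicts: the inner dict is created on first access to nested[i]
nested = defaultdict(dict)
for i in range(N):
    for j in range(N):
        nested[i][j] = randint(0, 10)

# 2. flat dict keyed by (i, j) tuples
flat = {(i, j): randint(0, 10) for i in range(N) for j in range(N)}

# 3. list of lists, usable here because the keys are dense integers from 0
grid = [[randint(0, 10) for _ in range(N)] for _ in range(N)]
```

Lookups are then `nested[i][j]`, `flat[i, j]` and `grid[i][j]` respectively.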

Ciao,
Marc 'BlackJack' Rintsch
 
kettle

Thanks for the heads up. Indeed it's just as nice as perl. One more
question though: this defaultdict seems to only work with python 2.5+.
In the case of python < 2.5 it seems I have to do something like:

    #!/usr/bin/python
    from random import randint

    dict_dict = {}
    for x in xrange(10):
        for y in xrange(10):
            r = randint(0,10)
            try:
                dict_dict[x][y] = r
            except:
                if x in dict_dict:
                    dict_dict[x][y] = r
                else:
                    dict_dict[x] = {}
                    dict_dict[x][y] = r

What I really want to / need to be able to do is autoincrement the
values when I hit another word. Again, in perl I'd just do something
like:

    my %my_hash;
    while (<FILE>) {
        chomp;
        @_ = split(/\s+/);
        grep { $my_hash{$_}++ } @_;
    }

and this generalizes transparently to a hash of hashes or hash of a
hash of hashes etc. In python < 2.5 this seems to require something
like:

    for line in file:
        words = line.split()
        for word in words:
            my_dict[word] = 1 + my_dict.get(word, 0)

which I guess I can generalize to a dict of dicts but it seems it will
require more if/else statements to check whether or not the higher-
level keys exist. I guess the real answer is that I should just
migrate to python2.5...!
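
For comparison, here is roughly what the word count and its two-level generalization could look like once `defaultdict` is available (Python 2.5 or later). This is a sketch only: the sample `lines` and the names `word_counts` and `pair_counts` are made up for illustration and stand in for the file being read.

```python
from collections import defaultdict

# Hypothetical input standing in for the lines of a file.
lines = ["the cat sat", "the cat ran", "a dog sat"]

# One level: word -> count
word_counts = defaultdict(int)
for line in lines:
    for word in line.split():
        word_counts[word] += 1

# Two levels: word -> following word -> count.
# The lambda builds a fresh inner counter for each new outer key.
pair_counts = defaultdict(lambda: defaultdict(int))
for line in lines:
    words = line.split()
    for first, second in zip(words, words[1:]):
        pair_counts[first][second] += 1
```

No if/else checks on the higher-level keys are needed; the missing levels are filled in on first access.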

-joe
 
Peter Otten

You can clean that up a bit:

    from random import randrange

    dict_dict = {}
    for x in xrange(10):
        dict_dict[x] = dict((y, randrange(11)) for y in xrange(10))

Well, there's also dict.setdefault():

    >>> pairs = ["ab", "ab", "ac", "bc"]
    >>> outer = {}
    >>> for a, b in pairs:
    ...     inner = outer.setdefault(a, {})
    ...     inner[b] = inner.get(b, 0) + 1
    ...
    >>> outer
    {'a': {'c': 1, 'b': 2}, 'b': {'c': 1}}

and it's not hard to write your own defaultdict:

    >>> class Dict(dict):
    ...     def __getitem__(self, key):
    ...         return self.get(key, 0)
    ...
    >>> d = Dict()
    >>> for c in "abbbcdeafgh": d[c] += 1
    ...
    >>> d
    {'a': 2, 'c': 1, 'b': 3, 'e': 1, 'd': 1, 'g': 1, 'f': 1, 'h': 1}
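
A related sketch, for the "hash of a hash of hashes" case: dict subclasses can define a `__missing__` hook, which `dict.__getitem__` calls when a key is absent. As far as I know the hook arrived in Python 2.5 together with `defaultdict`, so it does not help on older versions. The class name `AutoDict` is hypothetical, not something from the standard library.

```python
class AutoDict(dict):
    """Hypothetical helper: a dict that creates nested AutoDicts on demand."""

    def __missing__(self, key):
        # Only reached when key is absent; store a new level and return it.
        value = self[key] = type(self)()
        return value


tree = AutoDict()
tree["a"]["b"]["c"] = 1   # intermediate levels are created automatically
tree["a"]["x"] = 2
```

Unlike the `Dict` class above, a failed lookup here stores the new value, so merely reading `tree["typo"]` adds a key. The leaves are also not counters: `tree["a"]["n"] += 1` on a missing leaf would fail, because the default is another `AutoDict`, not 0.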

Peter
 
kettle

Nice, thanks for all the tips! I knew there had to be some handier
python ways to do these things. My initial attempts were just what
occurred to me first given my still limited knowledge of the language
and its idioms. Thanks again! -joe
 
kettle

One last question. I've heard the 'Explicit vs. Implicit' argument,
but this seems to boil down to a question of typical use cases and
what most people 'expect' as default behavior. Peter's Dict class,
which defines the __getitem__ method, seems like it is more generally
useful than the real default. What is the reasoning behind NOT using
this as the default implementation for a dict in python?
 
Marc 'BlackJack' Rintsch

How's that more useful in the general case? Maybe if you come from a
language where some default value pops up if the key is not present, you
are used to writing code in a way that exploits this fact. But in the
general case!? I don't need `defaultdict` very often, but I do want to
know if a key is not present in a dictionary, because most of the time
that's a special condition or error that has to be handled or signaled
up the call chain.
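
To illustrate the point, here is a small sketch of how the two behaviours differ when a key is misspelled. The keys and variable names are invented for the example.

```python
from collections import defaultdict

plain = {"spam": 1}
try:
    plain["sapm"]              # misspelled key
except KeyError as exc:
    missing = exc.args[0]      # the typo is reported right away

counts = defaultdict(int, {"spam": 1})
value = counts["sapm"]         # no error: the lookup returns 0 ...
# ... and the misspelled key has now been stored in the mapping as well
```

With the plain dict the mistake surfaces at the lookup; with the defaulting mapping it passes silently and leaves an extra entry behind.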

Ciao,
Marc 'BlackJack' Rintsch
 
Kay Schluehr

You might produce the behaviour of the hash_of_hashes type directly:

    class dict_of_dicts(dict):
        def __getitem__(self, key):
            d = dict.get(self, key, {})
            self[key] = d
            return d

        def __setitem__(self, key, val):
            assert isinstance(val, dict), "Value of type dict expected. %s found instead."%(type(val))
            dict.__setitem__(self, key, val)

    >>> d = dict_of_dicts()
    >>> d[0][1] = "A"
    >>> d[1][1] = "B"
    >>> d[1][2] = "C"
    >>> d
    {0: {1: 'A'}, 1: {1: 'B', 2: 'C'}}
    >>> d[0] = 0 # expects values of type dict
    Traceback
    ....
    AssertionError: Value of type dict expected. <type 'int'> found instead.
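
One possible variation, sketched under my own assumptions: `assert` statements are skipped when Python runs with `-O`, so a version that should always enforce the type check could raise `TypeError` explicitly. The class name `DictOfDicts` is hypothetical; the rest follows the class above.

```python
class DictOfDicts(dict):
    """Hypothetical variant of dict_of_dicts that raises TypeError."""

    def __getitem__(self, key):
        # Same idea as above: create the inner dict on first access.
        if key not in self:
            dict.__setitem__(self, key, {})
        return dict.__getitem__(self, key)

    def __setitem__(self, key, val):
        if not isinstance(val, dict):
            raise TypeError(
                "Value of type dict expected. %s found instead." % type(val)
            )
        dict.__setitem__(self, key, val)


d = DictOfDicts()
d[0][1] = "A"
d[1][1] = "B"
d[1][2] = "C"

try:
    d[0] = 0                  # rejected: values must be dicts
except TypeError as exc:
    error = str(exc)
```

After the rejected assignment, `d[0]` still holds its previous inner dict.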
 
