Are dictionaries the same as hashtables?

M

Martin Marcher

Are dictionaries the same as hashtables?

Yes, but there is nothing in there that does sane collision handling
like making a list instead of simply overwriting.

PS: your sig was *a bit* longer than you question. please don't do
that...

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iQIVAwUBSLO0KNjk1vit/YR7AQLAxhAAotF4xEbquq1C+fnVU93PhETiotFSw/ni
r32njO5MPu2ecyY4eOxGshzw6P2ez1i9K3Yk8bpm89FrsZeqnrdDhh8xFo7CKFjX
DeoBVE3x6jo4Q9iTvNpF6tUHCAgyxVuaV/F+Hz9mNdaJ5o+xxG474429Pumn1N4W
m5X4wk5BtIj1nV7R4xh+jOFDXFShSoCZBwyVGB+MIZUsQhAt4qN9MSuO3lhmNgtD
Eo03CVpdqiKcxqS77xnHIR8TtgahLyRAosMBenNCe5ELE7CALkVU/v4oNI+4G3Ju
n0mfkZA2iqWulcJD3QAdYwImFlXrlLYg6Y5OMvL/Tm4729htsvGBPZD0DSrR0rQs
tjgRIljhv3BwKoxsk/UUx2glbUShjEfgCimIePKmuSh7jN+CuCVsgaRj5zYGiCVQ
gintIQaiFheqjkpSfkRtLqBQBbLwzCS86Bzc++I0m+IENxExvuE3AY4Lmt5dVCAT
ox8TubGe50xgomWr9ms/szri9u5qYuUWzC2ByjUVauuGAQxHwQUQvjCBFM9TI6bc
do+TMTY8qJhb/qsW3TlX5HagW/Cn+jw+xwP8b+xeMNXAEA/Q3dwSKFTx1s5NwGg6
vvjwEdPooq3k2XEQFXZq12CUn0FYjhUAzwR/06JG6muAR0YEMHKGeqcuamuEKeU/
2VBXpaqVkPY=
=7TsX
-----END PGP SIGNATURE-----
 
C

cnb

Yes, but there is nothing in there that does sane collision handling
like making a list instead of simply overwriting.

PS: your sig was *a bit* longer than you question. please don't do
that...

 signature.asc
< 1KViewDownload

my what?
 
J

John Machin

Yes, but there is nothing in there that does sane collision handling
like making a list instead of simply overwriting.

Please clarify: (1) Nothing in where? In the Python dictionary
implementation? (2) What do you mean by "sane collision handling"?
What do you mean by "simply overwriting"?

TIA,
John
 
D

Diez B. Roggisch

Martin said:
Yes, but there is nothing in there that does sane collision handling
like making a list instead of simply overwriting.

The term "collision" is rather well defined when talking about associative
arrays using hashing (which python's dicts are): it means that two distinct
keys produce the same hash-value, leading to a bucket collision. And of
course python deals with that sanely, and creates a list of objects inside
the bucket which is traversed and comparison is used to determine which
actual value to retrieve.

Python does not have a "one key maps to a list of values"-semantics - which
I consider the sane choice...

However, you can have that using the defaultdict for example:

listdict = defaultdict(list)

listdict[key].append(value)

Diez
 
J

John Machin

The term "collision" is rather well defined when talking about associative
arrays using hashing (which python's dicts are): it means that two distinct
keys produce the same hash-value, leading to a bucket collision. And of
course python deals with that sanely, and creates a list of objects inside
the bucket which is traversed and comparison is used to determine which
actual value to retrieve.

I see neither buckets nor lists in the CPython svn trunk version of
dictobject.c and IIRC it's been using open addressing for a very long
time.
 
D

Diez B. Roggisch

John said:
I see neither buckets nor lists in the CPython svn trunk version of
dictobject.c and IIRC it's been using open addressing for a very long
time.

I don't know the exact names of the involved structures - I named them
liberally from my understanding of how associative arrays based on hashing
are implemented. But the below code shows that hash-collisions can occur
without corrupting data, while OTOH of course degenerating the lookup from
usually O(1) to O(n). Try playing around with it, commenting out the proper
hashing of the key will lead to great performance enhancements.

Diez

import pprint

class StupidThing(object):

def __init__(self, key):
self.key = key


def __hash__(self):
#return hash(self.key)
return 1


def __cmp__(self, other):
return cmp(self.key, other.key)


def __repr__(self):
return "<%s>" % self.key


d = {}

for a in xrange(1000):
d[StupidThing(str(a))] = a



pprint.pprint(d)
 
M

Martin


Hehe,

it was *my* sig... i was using some old box with a mut config that
still had the fortune script in it... sorry.

Anyway what I meant by collision handling was:

$ python
Python 2.5.2 (r252:60911, Jul 31 2008, 17:31:22)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
l = {}
l {}
l["a"] = 12
l["b"] = 13
l {'a': 12, 'b': 13}
l["a"] = "This is a collision"
l {'a': 'This is a collision', 'b': 13}

As you see position "a" is simply overwritten. Probably "sane" doesn't
really apply because how could the python creators possibly know
whether I just want to overwrite the value or indeed want some form of
collision handling (talk about lists vs. trees with there
subforms...)...

If one would want that he/she/it could still subclass dict and do more magic.

/martin

--
http://www.xing.com/profile/Martin_Marcher

You are not free to read this message,
by doing so, you have violated my licence
and are required to urinate publicly. Thank you.
 
F

Fredrik Lundh

Martin said:
Yes, but there is nothing in there that does sane collision handling
like making a list instead of simply overwriting.

are you sure you know what "collision handling" means in this context?

</F>
 
F

Fredrik Lundh

Diez said:
I don't know the exact names of the involved structures - I named them
liberally from my understanding of how associative arrays based on hashing
are implemented. But the below code shows that hash-collisions can occur
without corrupting data

http://en.wikipedia.org/wiki/Open_addressing

"Open addressing, or closed hashing, is a method of collision resolution
in hash tables. With this method a hash collision is resolved by
probing, or searching through alternate locations in the array (the
probe sequence) until either the target record is found, or an unused
array slot is found, which indicates that there is no such key in the
table."

</F>
 
D

Diez B. Roggisch

Cameron said:
Martin said:
On 2008-08-26 00:32:20, cnb wrote:
Are dictionaries the same as hashtables?
.
.
.
Python does not have a "one key maps to a list of values"-semantics -
which I consider the sane choice...

However, you can have that using the defaultdict for example:

listdict = defaultdict(list)

listdict[key].append(value)

Diez

? I'm lost. As I understand your terms, Python's dictionaries
map keys to objects, but you would prefer that Python's
dictionaries map keys only to lists of values. That *sounds*
like a complexification, at best. Are you trying to make a
point about implementation aligning with semantics?

The OP seems to want that (or at least sees it as one of two viable design
choices), see his other answer in this thread.

I certainly *don't* agree with that, I merely pointed out that python comes
with means to easily create such a data-structure in the case it is needed.

Diez
 
C

Chris Mellon

are you sure you know what "collision handling" means in this context?

I think the confusion comes from thinking that dictionaries are
(really) hash tables and thus that things like collision handling are
exposed to the user of the data structure. In this sense, no,
dictionaries are *not* hash tables. They are mappings of keys to
values, and they use hash tables as part of their implementation.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,266
Messages
2,571,079
Members
48,772
Latest member
Backspace Studios

Latest Threads

Top