unicode keys in dicts

Jiba

Hi all,

Is the following behaviour normal?

>>> d = {"é" : 1}
>>> d["é"]
1
>>> d[u"é"]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: u'\xe9'


It seems that "é" and u"é" are not considered the same key (in Python
2.3.3), even though they have the same hash code (as returned by hash()).

Yet "e" and u"e" (unaccented characters) are considered the same!

Jiba
 
Jeff Epler

>>> chr(0xe9) == unichr(0xe9)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: ASCII decoding error: ordinal not in range(128)

Unequal objects can hash to the same value. Your two keys are not
equal (in fact, you can't even compare them on my system). They would
be comparable but not equal on many systems, for instance one where the
system's encoding is Microsoft's CP850.
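The first point can be shown without involving encodings at all. What follows is only a minimal sketch: the class `Key` and its constant hash value are hypothetical, made up for this illustration, and are not anything from Jiba's session. A dict lookup needs a matching hash and then an equality check, so two objects that share a hash but compare unequal remain two different keys.

```python
class Key(object):
    """Hypothetical key type whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        # Deliberately collide: every Key hashes to the same number.
        return 42

    def __eq__(self, other):
        # Equality still depends on the name, not on the hash.
        return isinstance(other, Key) and self.name == other.name

    def __ne__(self, other):
        # Needed explicitly on Python 2; harmless on Python 3.
        return not self.__eq__(other)


a = Key("a")
b = Key("b")
d = {a: 1}

same_hash = hash(a) == hash(b)   # the hashes collide
found = b in d                   # but b is not equal to a, so it is not a key
```

Here `same_hash` is true while `found` is false, which is the same situation as "é" versus u"é": equal hashes, unequal keys.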

You can misconfigure your system to assume that byte strings are in (e.g.)
iso-8859-1 encoding by changing site.py.

Jeff
 
Peter Hansen

Jiba said:
> Is the following behaviour normal?
>
> >>> d = {"é" : 1}
> >>> d["é"]
> 1
> >>> d[u"é"]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> KeyError: u'\xe9'
>
> It seems that "é" and u"é" are not considered the same key (in Python
> 2.3.3), even though they have the same hash code (as returned by hash()).
>
> Yet "e" and u"e" (unaccented characters) are considered the same!

Well, "e" and u"e" _are_ the same character, while the unicode value that
comes from decoding the byte string "é" depends entirely on which codec
you use for the decoding. It is the same as u"é" only when decoded with
certain codecs. ASCII is 7-bit only, so the "é" value is not legal in
ASCII, which is likely your default encoding.

For example, try "é".decode('iso-8859-1') and you will probably get the
unicode value you were expecting.
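As a rough sketch of that suggestion, the snippet below decodes the byte string explicitly before using it as a key. It assumes the byte string "é" holds the single byte 0xE9, which is what a Latin-1 terminal would produce and is consistent with the KeyError above, but it is still an assumption about Jiba's setup. The b"" and u"" prefixes are used so that it runs on later Python 2 releases and on Python 3; it will not run as written on 2.3.3.

```python
raw = b"\xe9"        # assumed byte-string form of "é" (one Latin-1 byte)
d = {u"\xe9": 1}     # dict keyed by the unicode character U+00E9

# Decoding with an explicit codec yields a unicode key that matches.
key = raw.decode("iso-8859-1")
value = d[key]

# The ASCII codec cannot decode the byte at all, hence the UnicodeError
# seen when the two string types are compared implicitly.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# A different 8-bit codec maps the same byte to a different character,
# so the result would compare unequal to u"\xe9".
other = raw.decode("cp850")
```

With an explicit decode the lookup no longer depends on the default encoding, which is why the same byte can be "the same key" on one system and not on another.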

I'm not the best to answer this, but I would at least say that the above
behaviour is considered "normal", though it can be surprising to those
of us not expert in Unicode issues...

-Peter
 
