Odd unicode() behavior

maport

The behavior of the unicode built-in function when given a unicode
string seems a little odd to me:
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: decoding Unicode is not supported

I don't see why providing the encoding should make the function behave
differently when given a Unicode string. Surely unicode(s) ought to
behave exactly the same as unicode(s, sys.getdefaultencoding())?

Any opinions?

Mike.
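
For reference, a minimal Python 2 session reproducing the error above might look like this (the exact call from the original session isn't shown in the traceback, so the string and encoding here are assumed):

>>> s = u"abc"
>>> unicode(s)            # no encoding argument: the unicode string is returned as-is
u'abc'
>>> unicode(s, "ascii")   # encoding argument with a unicode string
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: decoding Unicode is not supported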
 
Fredrik Lundh

> The behavior of the unicode built-in function when given a unicode
> string seems a little odd to me:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: decoding Unicode is not supported
>
> I don't see why providing the encoding should make the function behave
> differently when given a Unicode string. Surely unicode(s) ought to
> behave exactly the same as unicode(s, sys.getdefaultencoding())?

nope.

if you omit the encoding argument, unicode() behaves pretty much like str(),
using either the __unicode__ method or __str__/__repr__ + decoding to get
a Unicode string.

see the language reference for details, e.g.:

http://pyref.infogami.com/unicode

</F>
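
A small Python 2 sketch of the distinction Fredrik describes (the class names here are illustrative, not from the thread):

class Greeting(object):
    def __unicode__(self):
        return u"h\xe9llo"

class OnlyStr(object):
    def __str__(self):
        return "plain ascii"

print repr(unicode(Greeting()))              # u'h\xe9llo'      -- __unicode__ is used
print repr(unicode(OnlyStr()))               # u'plain ascii'   -- __str__ result decoded with the default encoding
print repr(unicode("caf\xc3\xa9", "utf-8"))  # u'caf\xe9'       -- a byte string is decoded
# unicode(u"abc", "utf-8") would raise TypeError: decoding Unicode is not supported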
 
