Jim
Hello,
I'm trying to do urllib.urlencode() with unicode correctly, and I
wonder if some kind person could set me straight?
My understanding is that I am supposed to be able to urlencode anything
up to the top half of latin-1 -- decimal 128-255.
I can't just send urlencode a unicode character:
Python 2.3.5 (#2, May 4 2005, 08:51:39)
[GCC 3.3.5 (Debian 1:3.3.5-12)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/usr/lib/python2.3/urllib.py", line 1206, in urlencode
v = quote_plus(str(v))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in
position 3: ordinal not in range(128)
Is it instead Right that I should send a unicode string to urlencode by
first encoding it to 'latin-1' ?
'x=abc%F6def'
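A minimal sketch of the encode-first approach asked about above, for illustration only. The helper name urlencode_with_charset and its charset parameter are hypothetical and not part of urllib; the import fallback is an assumption so that the same lines also run on Python 3, where urlencode lives in urllib.parse.

```python
try:
    from urllib import urlencode          # Python 2, as in the session above
except ImportError:
    from urllib.parse import urlencode    # Python 3 location (assumption)


def urlencode_with_charset(params, charset="latin-1"):
    """Encode unicode values to bytes in `charset`, then urlencode them.

    Hypothetical helper: urllib itself does not provide this function.
    """
    encoded = {}
    for key, value in params.items():
        # type(u"") is `unicode` on Python 2 and `str` on Python 3.
        if isinstance(value, type(u"")):
            value = value.encode(charset)
        encoded[key] = value
    return urlencode(encoded)


print(urlencode_with_charset({"x": u"abc\xf6def"}))
```

Passing charset="utf-8" instead would percent-encode the same character as two bytes, so the receiving server has to agree on which charset the query string uses.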
If it is Right, I'm puzzled as to why urlencode doesn't do it. Or am I
missing something? urllib.urlencode() contains the lines:
    elif _is_unicode(v):
        # is there a reasonable way to convert to ASCII?
        # encode generates a string, but "replace" or "ignore"
        # lose information and "strict" can raise UnicodeError
        v = quote_plus(v.encode("ASCII","replace"))
        l.append(k + '=' + v)
so I think that it is *not* liking latin-1.
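To make the difference concrete, here is a small sketch comparing what the quoted encode("ASCII","replace") call does to the same string with what an explicit latin-1 encode does. The variable names are illustrative, and the import fallback is an assumption so the snippet also runs on Python 3.

```python
try:
    from urllib import quote_plus          # Python 2
except ImportError:
    from urllib.parse import quote_plus    # Python 3 location (assumption)

value = u"abc\xf6def"

# What the quoted branch does: non-ASCII characters become "?".
ascii_replaced = value.encode("ascii", "replace")
lossy = quote_plus(ascii_replaced)

# Encoding to latin-1 first keeps the character as the single byte 0xF6.
latin1_bytes = value.encode("latin-1")
lossless = quote_plus(latin1_bytes)

print(lossy)
print(lossless)
```

With the "replace" error handler the original character cannot be recovered from the resulting query string, which is the information loss the source comment warns about.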
Thank you,
Jim