UnicodeEncodeError - opening encoded URLs

D4rko

Hi!

I have a problem with urllib2's open() function. My application
receives the following request; as I can see in the development
server console, it is properly encoded:

[27/Mar/2009 22:22:29] "GET /[blahblah]/Europa_%C5%9Arodkowa/5 HTTP/1.1" 500 54572

Then it uses this request parameter as the name variable to build a
Wikipedia link, and tries to access it with the following code:

import urllib2

url = u'http://pl.wikipedia.org/w/index.php?title=' + name + '&printable=yes'
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
wikipage = opener.open(url)

Unfortunately, the last line fails with the exception:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u015a' in
position 30: ordinal not in range(128). Using urlencode(url) results
in TypeError "not a valid non-string sequence or mapping object", and
quote(url) fails with KeyError u'\u015a'. How can I properly
parse this request to make it work (i.e. access
http://pl.wikipedia.org/wiki/Europa_Środkowa)?
 
Matt Nordhoff

What if you just used a regular byte string for the URL?

url = 'http://pl.wikipedia.org/w/index.php?title=' + name + '&printable=yes'

(Unless "name" is a unicode object as well.)

(Nice user-agent, BTW. :p )
 
D4rko

(Unless "name" is a unicode object as well.)

Unfortunately it is; it's the argument that is automagically handed to
the handler function by the Django URL dispatcher. I guess I may need
to encode it back to pure ASCII with the "%xx" escapes, but I can't
find the function that would do it. Any ideas?
 
Peter Otten

'%C5%9Arodkowa'
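
The string above is what you get when the name is encoded to UTF-8 and each non-ASCII byte is written as a "%xx" escape. Here is a minimal sketch of that step, assuming Python 3, where urllib.parse.quote replaces Python 2's urllib.quote. In Python 2 the rough equivalent would be urllib.quote(name.encode('utf-8')), since quote there only handles byte strings. The variable names are illustrative only.

```python
from urllib.parse import quote, unquote

# The unicode name as the URL dispatcher would hand it over (assumed value).
name = '\u015arodkowa'

# quote() encodes the text as UTF-8 first, then percent-escapes
# every byte that is not safe in a URL.
encoded = quote(name)
print(encoded)

# unquote() reverses the process, so the round trip is lossless.
decoded = unquote(encoded)
print(decoded == name)
```

The key point is that percent-encoding works on bytes, so the unicode value has to be turned into UTF-8 before (or as part of) quoting.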

Peter
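
For the full request, the TypeError from urlencode(url) in the original question likely comes from passing a finished URL string where urlencode expects a mapping or a sequence of key/value pairs. A sketch of building the whole URL that way, again assuming Python 3 (urllib.request and urllib.parse instead of urllib2 and urllib). The helper name build_wiki_request is hypothetical, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_wiki_request(name):
    """Hypothetical helper: prepare a request for the printable wiki page."""
    # urlencode takes a mapping and percent-encodes each value as UTF-8,
    # so the unicode title never reaches the ASCII-only URL layer.
    query = urlencode({'title': name, 'printable': 'yes'})
    url = 'http://pl.wikipedia.org/w/index.php?' + query
    # Same User-agent header as in the original post.
    return Request(url, headers={'User-Agent': 'Mozilla/5.0'})

req = build_wiki_request('Europa_\u015arodkowa')
print(req.full_url)
```

Passing req to urllib.request.urlopen() would then fetch the page. Letting urlencode assemble the query string also escapes any "&" or "=" that might appear in the title.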
 
