[2.5.1] "UnicodeDecodeError: 'ascii' codec can't decode byte"?

Gilles Ganault

Hello

I'm getting this error while downloading and parsing web pages:

=====
title = m.group(1)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 48: ordinal not in range(128)
=====

From what I understand, it's because some strings are Unicode, and
hence contain characters that are illegal in ASCII.

Does someone know how to solve this error?

Thank you.
 
Ulrich Eckhardt

Gilles said:
I'm getting this error while downloading and parsing web pages:

=====
title = m.group(1)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 48: ordinal not in range(128)
=====

From what I understand, it's because some strings are Unicode, and
hence contain characters that are illegal in ASCII.

You just need to use a codec according to the encoding of the webpage. Take
a look at
http://wiki.python.org/moin/Python3UnicodeDecodeError
It is about Python 3, but the principles apply nonetheless. In any case,
throwing the error at a websearch will turn up lots of solutions.
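
As a rough sketch of what "use a codec according to the encoding of the webpage" can look like: decode the downloaded bytes explicitly before running the regular expression. The page content, the `extract_title` helper, and the ISO-8859-1 default are assumptions made for illustration (0xe9 is "é" in that encoding); the real encoding should come from the page's Content-Type header or meta tag. The sketch is written for a current Python, not 2.5.1.

```python
import re

# Hypothetical page content; 0xe9 is "é" in ISO-8859-1 (Latin-1).
raw_page = b"<html><head><title>Caf\xe9 du coin</title></head></html>"


def extract_title(raw_bytes, encoding="iso-8859-1"):
    """Decode the raw bytes first, then search the resulting text.

    The default encoding is only a guess for this example.
    """
    text = raw_bytes.decode(encoding)
    m = re.search(r"<title>(.*?)</title>", text, re.IGNORECASE | re.DOTALL)
    if m is None:
        return None
    return m.group(1)


title = extract_title(raw_page)
print(repr(title))
```

Decoding once, right after the download, keeps everything downstream working on text, so no implicit ASCII conversion is attempted later.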

Uli
 
Steve Holden

Ulrich said:
You just need to use a codec according to the encoding of the webpage. Take
a look at
http://wiki.python.org/moin/Python3UnicodeDecodeError
It is about Python 3, but the principles apply nonetheless. In any case,
throwing the error at a websearch will turn up lots of solutions.

I won't believe that statement is producing the error until I see a
traceback. As far as I'm aware the re module can handle Unicode. Getting
a UnicodeDecodeError in an assignment would be unusual to say the least.
Though I suppose it's not impossible that calling the .group() method
of a match object might raise one, it seems unlikely.
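
One possible way to collect the traceback asked for here, as a sketch only: wrap the suspect operation, keep the full traceback text, and read the details that UnicodeDecodeError carries. The `decode_or_report` helper and the sample bytes are made up for illustration, and the `except ... as` syntax needs a Python newer than 2.5.

```python
import traceback


def decode_or_report(raw, encoding="ascii"):
    """Try to decode raw bytes.

    Returns (text, None) on success, or (None, report) on failure, where
    report holds the traceback text and the attributes of the exception.
    """
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as exc:
        report = {
            "traceback": traceback.format_exc(),    # full traceback as a string
            "encoding": exc.encoding,               # codec that failed
            "position": exc.start,                  # index of the first bad byte
            "byte": exc.object[exc.start:exc.end],  # the offending byte(s)
        }
        return None, report


# Hypothetical input: 0xe9 is not valid ASCII.
text, report = decode_or_report(b"Caf\xe9")
if report is not None:
    print(report["traceback"])
```

The traceback text shows which line actually triggered the decode, which may not be the line quoted in the original post.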

regards
Steve
 