UnicodeEncodeError - a bit out of my element...

erikcw

Hi all,

I'm trying to parse an email message, but am running into this
exception.

Traceback (most recent call last):
  File "wa.py", line 336, in ?
    main()
  File "wa.py", line 332, in main
    print out['msg']
UnicodeEncodeError: 'ascii' codec can't encode character u'\xd6' in position 238: ordinal not in range(128)

How can I decode/encode this string to print to stdout and send again
in another email? Do I have to know what language the email is in?

Thanks!
Erik
 
liupeng

I took this from the Sams Python Phrasebook, section
"Converting Unicode to Local Strings":

import string

locStr = "El "
uniStr = u"Ni\u00F1o"
print uniStr.encode('utf-8')
print uniStr.encode('utf-16')
print uniStr.encode('iso-8859-1')
# Combine local and unicode results
# in a new unicode string
newStr = locStr + uniStr
print newStr.encode('iso-8859-1')
# Encoding to ascii will error because character '\xF1'
# is out of range, so replace it with 'n' first
asciiStr = newStr.encode('iso-8859-1')
asciiStr = asciiStr.translate(string.maketrans('\xF1', 'n'), '')
print asciiStr.encode('ascii')
# This last line raises UnicodeEncodeError
print newStr.encode('ascii')

unicode_str.py

Niño
ÿN|I|ñ|o
Niño
El Niño
El Nino
Traceback (most recent call last):
  File "C:\books\python\CH2\code\unicode_str.py", line 19, in ?
    print newStr.encode('ascii')
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 5: ordinal not in range(128)
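
The phrasebook example swaps out one character by hand with maketrans. A more general sketch is below, assuming Python 3 syntax (where the text type is str rather than unicode); the helper name to_ascii is made up for illustration. It uses the error handlers that encode() accepts, plus unicodedata to strip accents.

```python
import unicodedata

def to_ascii(text):
    """Decompose accented characters, then drop whatever is still non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

new_str = "El " + "Ni\u00f1o"

# Built-in error handlers avoid the exception without a translation table.
replaced = new_str.encode("ascii", "replace")           # b'El Ni?o'
escaped = new_str.encode("ascii", "backslashreplace")   # b'El Ni\\xf1o'
stripped = to_ascii(new_str)                             # 'El Nino'
```

All three variants lose information, so they only make sense when the destination really cannot take anything but ASCII.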

 
kyosohma

You'll need to do some encoding/decoding work. Check out the following
links on Unicode:

http://www.reportlab.com/i18n/python_unicode_tutorial.html
http://www.amk.ca/python/howto/unicode
http://www.jorendorff.com/articles/unicode/python.html

And here are a few links on parsing:

http://docs.python.org/api/arg-parsing.html
http://www.diveintopython.org/xml_processing/unicode.html

Probably more information than you need, but it should help answer
your question (and maybe any future questions about Unicode).
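
On the "do I have to know what language" part: what matters is the charset the message declares, not the language. A rough sketch of reading it with the standard email package follows, assuming Python 3 syntax; the raw message is made up for illustration.

```python
from email import message_from_bytes

# Hypothetical raw message; a real one would come from a mailbox or socket.
raw = (
    b"From: a@example.com\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"=D6sterreich\r\n"
)

msg = message_from_bytes(raw)
# Fall back to ascii when the message declares no charset.
charset = msg.get_content_charset() or "ascii"
# decode=True undoes the transfer encoding and returns raw bytes.
payload = msg.get_payload(decode=True)
# Turn the bytes into text; undecodable bytes become U+FFFD.
text = payload.decode(charset, "replace")
```

Once the body is a text string, it can be re-encoded to whatever the output side needs.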

Mike
 
erikcw

I used the .encode("utf-8") method on the string and it fixed
everything! Thanks for your help!
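
A minimal sketch of that fix, written in Python 3 syntax (an assumption; the traceback in the question comes from Python 2, where the text type is unicode). The body string and the build_reply helper are stand-ins for illustration, not the code from wa.py.

```python
from email.mime.text import MIMEText

def build_reply(body, subject):
    """Build an outgoing message whose body is declared and encoded as UTF-8."""
    msg = MIMEText(body, "plain", "utf-8")
    msg["Subject"] = subject
    return msg

body = "\xd6sterreich"  # stand-in for the text held in out['msg']

# Explicit encoding: the result is bytes that any byte stream can carry.
encoded = body.encode("utf-8")   # b'\xc3\x96sterreich'

# For sending again, declaring the charset lets the email package do the encoding.
reply = build_reply(body, "Forwarded message")
```

UTF-8 can represent every Unicode character, which is why encoding to it does not raise the way the ascii codec did.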
 
