Unicode in MIMEText

Damjan

Why doesn't this work:

>>> from email.MIMEText import MIMEText
>>> msg = MIMEText(u'\u043a\u0438\u0440\u0438\u043b\u0438\u0446\u0430')
>>> msg.set_charset('utf-8')
>>> msg.as_string()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.4/email/Message.py", line 129, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/usr/lib/python2.4/email/Generator.py", line 82, in flatten
    self._write(msg)
  File "/usr/lib/python2.4/email/Generator.py", line 113, in _write
    self._dispatch(msg)
  File "/usr/lib/python2.4/email/Generator.py", line 139, in _dispatch
    meth(msg)
  File "/usr/lib/python2.4/email/Generator.py", line 180, in _handle_text
    payload = cset.body_encode(payload)
  File "/usr/lib/python2.4/email/Charset.py", line 366, in body_encode
    return email.base64MIME.body_encode(s)
  File "/usr/lib/python2.4/email/base64MIME.py", line 136, in encode
    enc = b2a_base64(s[i:i + max_unencoded])
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)
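
For comparison, here is a minimal sketch of the same message on a current Python 3 interpreter. That is an assumption, since the session above is Python 2.4. In Python 3 the class lives in email.mime.text, and passing the charset to the constructor lets the text be encoded to bytes before the base64 step runs.

```python
from email.mime.text import MIMEText

# Same Cyrillic text as in the session above.
text = '\u043a\u0438\u0440\u0438\u043b\u0438\u0446\u0430'

# Pass the charset up front so the payload is stored as UTF-8 bytes
# instead of being handed to the base64 encoder as raw text.
msg = MIMEText(text, 'plain', 'utf-8')

# The flattened message is plain ASCII because the body is base64-encoded.
flat = msg.as_string()
print(flat)
```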
 
Damjan

Why doesn't this work:
from email.MIMEText import MIMEText
msg = MIMEText(u'\u043a\u0438\u0440\u0438\u043b\u0438\u0446\u0430')
msg.set_charset('utf-8')
msg.as_string() ....
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7:
ordinal not in range(128)

It's a real shame that Unicode support in the Python library is sometimes
very weak...

Anyway, I solved my problem by patching email.Charset:

--- Charset.py~ 2005-11-24 04:20:09.000000000 +0100
+++ Charset.py 2005-11-24 04:21:02.000000000 +0100
@@ -244,6 +244,8 @@
         """Convert a string from the input_codec to the output_codec."""
         if self.input_codec <> self.output_codec:
             return unicode(s, self.input_codec).encode(self.output_codec)
+        elif isinstance(s, unicode):
+            return s.encode(self.output_codec)
         else:
             return s
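
The branch logic of the patched method can be sketched as a standalone function. This is an illustration only: `convert` here is a hypothetical free function written for Python 3, where `str` plays the role of Python 2's `unicode`. It is not the library's actual method.

```python
def convert(s, input_codec, output_codec):
    """Mirror the patched branch order (hypothetical helper, Python 3)."""
    if input_codec != output_codec:
        # Byte string in one codec: decode it, then re-encode.
        return str(s, input_codec).encode(output_codec)
    elif isinstance(s, str):
        # The added branch: text with matching codecs is encoded
        # instead of being passed through unchanged.
        return s.encode(output_codec)
    else:
        # Bytes already in the output codec need no work.
        return s


# Text input now comes back as UTF-8 bytes.
print(convert('\u043a', 'utf-8', 'utf-8'))
```

As in the patch, text input is only handled when both codecs are equal. When they differ, the first branch still tries to decode and raises TypeError for text.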
 
Steve Holden

Damjan said:
It's a real shame that Unicode support in the Python library is sometimes
very weak...

Anyway, I solved my problem by patching email.Charset:

--- Charset.py~ 2005-11-24 04:20:09.000000000 +0100
+++ Charset.py 2005-11-24 04:21:02.000000000 +0100
@@ -244,6 +244,8 @@
         """Convert a string from the input_codec to the output_codec."""
         if self.input_codec <> self.output_codec:
             return unicode(s, self.input_codec).encode(self.output_codec)
+        elif isinstance(s, unicode):
+            return s.encode(self.output_codec)
         else:
             return s

... and, being concerned to improve the library, you logged this patch in
SourceForge for consideration by the developers?

That's the only way to guarantee proper consideration of your fix.

regards
Steve
 
Damjan

... and, being concerned to improve the library, you logged this patch in
SourceForge for consideration by the developers?

That's the only way to guarantee proper consideration of your fix.

OK, I will. Can you confirm that the patch is correct?
Maybe I got something wrong?
 
Steve Holden

Damjan said:
OK, I will. Can you confirm that the patch is correct?
Maybe I got something wrong?

I can't confirm its correctness, but I can say it looks reasonable enough
to submit as a patch. The fact that you have identified an issue and a
possible fix is quite enough to allow you to submit the patch.

The adequacy of the patch will ultimately be decided by the maintainer
who considers your submission (in all probability Barry Warsaw, but not
necessarily).

Thanks for taking the time to improve the quality of the Python library.

regards
Steve
 
Damjan

patch submitted...
Thanks for taking the time to improve the quality of the Python library.

Do you think it would be possible to run some kind of automatic,
comprehensive test of the standard library's compatibility with unicode
strings?
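
One small piece of such a test could look like the sketch below. It assumes a Python 3 interpreter and covers only MIMEText. The sample list and the helper names `roundtrip` and `failing_samples` are made up for illustration.

```python
import email
from email.mime.text import MIMEText

# Hypothetical sample set; a real test would cover many more scripts.
SAMPLES = [
    'plain ascii',
    '\u043a\u0438\u0440\u0438\u043b\u0438\u0446\u0430',
    'caf\u00e9',
]


def roundtrip(text, charset='utf-8'):
    """Build a message, flatten it, parse it back, return the decoded body."""
    flat = MIMEText(text, 'plain', charset).as_string()
    parsed = email.message_from_string(flat)
    return parsed.get_payload(decode=True).decode(charset)


def failing_samples(samples=SAMPLES):
    """Return the samples that raise or do not survive the round trip."""
    failures = []
    for text in samples:
        try:
            ok = roundtrip(text) == text
        except UnicodeError:
            ok = False
        if not ok:
            failures.append(text)
    return failures


print(failing_samples())
```

The same loop could be pointed at other modules by swapping `roundtrip` for a different function that should accept text.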
 
