How do I decode unicode characters in the subject using email.message_from_string()?

Roy H. Han · Feb 25, 2009

Dear python-list,

I'm having some trouble decoding an email header using the standard
imaplib.IMAP4 class and email.message_from_string method.

In particular, email.message_from_string() does not seem to properly
decode unicode characters in the subject.

How do I decode unicode characters in the subject?

I read in the documentation that the email module supports RFC 2047.
But is there a way to make imaplib.IMAP4 and email.message_from_string
use this protocol? I'm using Python 2.5.2 on Fedora. Perhaps this
problem has been fixed already in Python 2.6.

RHH

John Machin · Feb 25, 2009

Roy H. Han said:
How do I decode unicode characters in the subject?

You don't. You can't. You decode str objects into unicode objects. You
encode unicode objects into str objects. If your input is not a str
object, you have a problem.
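
To make the direction of each operation concrete, here is a minimal sketch. It is written in Python 3 syntax, which is an assumption relative to this thread (the original poster is on Python 2.5): in Python 3, `bytes` plays the role of the old `str` and `str` plays the role of the old `unicode`. The sample bytes are made up for illustration.

```python
# Python 3 sketch: bytes here corresponds to Python 2's str,
# and str corresponds to Python 2's unicode.
raw = b"caf\xe9"                  # hypothetical ISO-8859-1 encoded bytes

text = raw.decode("iso-8859-1")   # decode: bytes -> text
assert isinstance(text, str)

utf8 = text.encode("utf-8")       # encode: text -> bytes
assert isinstance(utf8, bytes)
```

Calling `decode` on something that is already text is the mistake the terminology tends to hide.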

I'm no expert on the email package, but experts don't have crystal
balls, so let's gather some data for them while we're waiting for
their timezones to align:

Presumably your code is doing something like:
msg = email.message_from_string(a_string)

Please report the results of
print repr(a_string)
and
print type(msg)
print msg.items()
and tell us what you expected.

Cheers,
John

rdmurray · Feb 25, 2009

John Machin said:
You don't. You can't. You decode str objects into unicode objects. You
encode unicode objects into str objects. If your input is not a str
object, you have a problem.

I can't speak for the OP, but I had a similar (and possibly
identical-in-intent) question. Suppose you have a Subject line that
looks like this:

Subject: 'u' Obselete type =?ISO-8859-1?Q?--_it_is_identical_?= =?ISO-8859-1?Q?to_=27d=27=2E_=287=29?=

How do you get the email module to decode that into unicode? The same
question applies to the other header lines, and the answer is it isn't
easy, and I had to read and reread the docs and experiment for a while
to figure it out. I understand there's going to be a sprint on the
email module at pycon, maybe some of this will get improved then.

Here's the final version of my test program. The third to last line is
one I thought ought to work given that Header has a __unicode__ method.
The final line is the one that did work (note the kludge to turn None
into 'ascii'...IMO 'ascii' is what decode_header _should_ be returning,
and this code shows why!)

-------------------------------------------------------------------
from email import message_from_string
from email.header import Header, decode_header

x = message_from_string("""\
To: test
Subject: 'u' Obselete type =?ISO-8859-1?Q?--_it_is_identical_?= =?ISO-8859-1?Q?to_=27d=27=2E_=287=29?=

this is a test.
""")

print x
print "--------------------"
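for key, header in x.items():
    print key, 'type', type(header)
    # Wrapping the raw string in Header does not decode the encoded words.
    print key+":", unicode(Header(header)).decode('utf-8')
    # decode_header returns (string, charset) pairs; charset is None for plain text.
    print key+":", decode_header(header)
    # Decode each chunk with its charset (falling back to 'ascii'), then join.
    print key+":", ''.join([s.decode(t or 'ascii') for (s, t) in decode_header(header)]).encode('utf-8')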
-------------------------------------------------------------------

From nobody Wed Feb 25 08:35:29 2009
To: test
Subject: 'u' Obselete type =?ISO-8859-1?Q?--_it_is_identical_?=
=?ISO-8859-1?Q?to_=27d=27=2E_=287=29?=

this is a test.

--------------------
To type <type 'str'>
To: test
To: [('test', None)]
To: test
Subject type <type 'str'>
Subject: 'u' Obselete type =?ISO-8859-1?Q?--_it_is_identical_?= =?ISO-8859-1?Q?to_=27d=27=2E_=287=29?=
Subject: [("'u' Obselete type", None), ("-- it is identical to 'd'. (7)", 'iso-8859-1')]
Subject: 'u' Obselete type-- it is identical to 'd'. (7)
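
The line that worked can be wrapped in a small helper. What follows is only a sketch in Python 3 syntax (an assumption; the program above is Python 2), and `header_to_text` is a hypothetical name, not something the email package provides. In Python 3, decode_header can hand back either str or bytes chunks, so the sketch handles both.

```python
from email.header import decode_header

def header_to_text(raw_value, fallback="ascii"):
    """Hypothetical helper: decode an RFC 2047 header value into text."""
    pieces = []
    for chunk, charset in decode_header(raw_value):
        if isinstance(chunk, bytes):
            # charset is None for chunks that were not encoded words
            chunk = chunk.decode(charset or fallback, errors="replace")
        pieces.append(chunk)
    return "".join(pieces)

subject = ("'u' Obselete type =?ISO-8859-1?Q?--_it_is_identical_?= "
           "=?ISO-8859-1?Q?to_=27d=27=2E_=287=29?=")
print(header_to_text(subject))
```

A charset name that Python does not know would still raise LookupError here; a real program would probably want to catch that.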

--RDM

Roy H. Han · Feb 25, 2009

Thanks for writing back, RDM and John Machin. Tomorrow I'll try the
code you suggested, RDM. It looks quite helpful and I'll report the
results.

In the meantime, John asked for more data. The sender's email client
is Microsoft Outlook 11. The recipient email client is Lotus Notes.

Actual Subject
=?us-ascii?Q?Inteum_C/SR_User_Tip:__Quick_Access_to_Recently_Opened_Inteu?=\r\n\t=?us-ascii?Q?m_C/SR_Records?=

Expected Subject
Inteum C/SR User Tip: Quick Access to Recently Opened Inteum C/SR Records

X-Mailer
Microsoft Office Outlook 11

X-MimeOLE
Produced By Microsoft MimeOLE V6.00.2900.5579

RHH

Steve Holden · Feb 25, 2009

from email.header import decode_header
print decode_header("=?us-ascii?Q?Inteum_C/SR_User_Tip:__Quick_Access_to_Recently_Opened_Inteu?=\r\n\t=?us-ascii?Q?m_C/SR_Records?=")
[('Inteum C/SR User Tip: Quick Access to Recently Opened Inteum C/SR Records', 'us-ascii')]

regards
Steve

rdmurray · Feb 25, 2009

Steve Holden said:
decode_header("=?us-ascii?Q?Inteum_C/SR_User_Tip:__Quick_Access_to_Recently_Opened_Inteu?=\r\n\t=?us-ascii?Q?m_C/SR_Records?=")
[('Inteum C/SR User Tip: Quick Access to Recently Opened Inteum C/SR
Records', 'us-ascii')]

It is interesting that decode_header does what I would consider to be
the right thing (from a pragmatic standpoint) with that particular bit
of Microsoft not-quite-standards-compliant brain-damage; but, removing
the tab is not in fact standards compliant if I'm reading the RFC
correctly.

--RDM

Roy H. Han · Feb 25, 2009

Cool, it works!

Thanks, RDM, for stating the right approach.
Thanks, Steve, for teaching by example.

I wonder why the email.message_from_string() method doesn't call
email.header.decode_header() automatically.
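
For what it is worth, later Python 3 releases added an opt-in for this. The sketch below assumes Python 3.6 or newer (not the 2.5 used in this thread) and passes `policy=email.policy.default`, under which header values are decoded when they are fetched. The message text is a made-up minimal example built around the Outlook subject quoted earlier in this thread.

```python
import email
from email import policy

raw = (
    "To: test\r\n"
    "Subject: =?us-ascii?Q?Inteum_C/SR_User_Tip:__Quick_Access_to_Recently_Opened_Inteu?=\r\n"
    "\t=?us-ascii?Q?m_C/SR_Records?=\r\n"
    "\r\n"
    "this is a test.\r\n"
)

# Without a policy argument (the backward-compatible behaviour),
# msg["Subject"] is still the raw, encoded string.
legacy = email.message_from_string(raw)

# With policy.default, the encoded words are decoded on access.
modern = email.message_from_string(raw, policy=policy.default)

print(repr(legacy["Subject"]))
print(str(modern["Subject"]))
```

The default stays raw for backward compatibility, so the decoded form has to be requested explicitly.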

Steve Holden · Feb 25, 2009

rdmurray said:
It is interesting that decode_header does what I would consider to be
the right thing (from a pragmatic standpoint) with that particular bit
of Microsoft not-quite-standards-compliant brain-damage; but, removing
the tab is not in fact standards compliant if I'm reading the RFC
correctly.

You'd need to quote me chapter and verse on that. I understood that the
tab simply indicated continuation, but it's a *long* time since I read
the RFCs.

regards
Steve

Thorsten Kampe · Feb 25, 2009

* Roy H. Han (Wed, 25 Feb 2009 10:17:22 -0500)

Thanks, RDM, for stating the right approach.
Thanks, Steve, for teaching by example.

I wonder why the email.message_from_string() method doesn't call
email.header.decode_header() automatically.

And I wonder why you would think the header contains Unicode characters
when it says "us-ascii" ("=?us-ascii?Q?"). I think there is a tendency
to label everything "Unicode" someone does not understand.

Thorsten

Gabriel Genellina · Feb 25, 2009

On Wed, 25 Feb 2009 13:40:31 -0200, Thorsten Kampe wrote:

And I wonder why you would think the header contains Unicode characters
when it says "us-ascii" ("=?us-ascii?Q?"). I think there is a tendency
to label everything "Unicode" someone does not understand.

And I wonder why you would think the header does *not* contain Unicode
characters when it says "us-ascii"?. I think there is a tendency here
too...

Thorsten Kampe · Feb 25, 2009

* Gabriel Genellina (Wed, 25 Feb 2009 14:00:16 -0200)

And I wonder why you would think the header does *not* contain Unicode
characters when it says "us-ascii"?.

Basically because it didn't contain any Unicode characters (anything
outside the ASCII range).

Thorsten

Tim Golden · Feb 25, 2009

Thorsten Kampe said:

Basically because it didn't contain any Unicode characters (anything
outside the ASCII range).

And I imagine that Gabriel's point was -- and my point certainly
is -- that Unicode includes all the characters *inside* the
ASCII range.
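
That subset relationship is easy to check. A minimal sketch, assuming Python 3 (the thread itself is on Python 2.5); the sample string is borrowed from the subject under discussion.

```python
# Every ASCII byte decodes to the Unicode code point with the same number,
# and UTF-8 encodes that code point back to the identical single byte.
for n in range(128):
    byte = bytes([n])
    char = byte.decode("ascii")
    assert ord(char) == n
    assert char.encode("utf-8") == byte

# Consequently, pure-ASCII data decodes identically under both codecs.
data = b"Inteum C/SR Records"
assert data.decode("ascii") == data.decode("utf-8")
```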

TJG

Gabriel Genellina · Feb 25, 2009

On Wed, 25 Feb 2009 15:01:08 -0200, Thorsten Kampe wrote:

Basically because it didn't contain any Unicode characters (anything
outside the ASCII range).

I think you have to revise your definition of "Unicode".

rdmurray · Feb 25, 2009

Steve Holden said:
You'd need to quote me chapter and verse on that. I understood that the
tab simply indicated continuation, but it's a *long* time since I read
the RFCs.

Tab is not mentioned in RFC 2822 except to say that it is a valid
whitespace character. Header folding (insertion of <cr><lf>) can
occur most places whitespace appears, and is defined in section
2.2.3 thusly:

Each header field is logically a single line of characters comprising
the field name, the colon, and the field body. For convenience
however, and to deal with the 998/78 character limitations per line,
the field body portion of a header field can be split into a multiple
line representation; this is called "folding". The general rule is
that wherever this standard allows for folding white space (not
simply WSP characters), a CRLF may be inserted before any WSP. For
example, the header field:

    Subject: This is a test

can be represented as:

    Subject: This
     is a test

[irrelevant note elided]

The process of moving from this folded multiple-line representation
of a header field to its single line representation is called
"unfolding". Unfolding is accomplished by simply removing any CRLF
that is immediately followed by WSP. Each header field should be
treated in its unfolded form for further syntactic and semantic
evaluation.

So, the whitespace characters are supposed to be left unchanged
after unfolding.
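
Read literally, that unfolding rule is a one-line substitution. The following is only a sketch in Python 3 syntax (the thread itself uses Python 2.5), and `unfold` is a hypothetical name rather than anything in the email package. It covers only the RFC 2822 rule quoted above; RFC 2047 separately says that whitespace between two adjacent encoded words is ignored for display, which this sketch does not attempt.

```python
import re

# CRLF immediately followed by WSP (space or tab). The lookahead leaves
# the WSP character itself in place, as section 2.2.3 describes.
_FOLD = re.compile(r"\r\n(?=[ \t])")

def unfold(header_field):
    """Hypothetical helper: unfold one folded header field."""
    return _FOLD.sub("", header_field)

folded = "Subject: This\r\n is a test"
print(unfold(folded))   # Subject: This is a test
```

Applied to a tab-folded field, the tab survives: `unfold("Subject: a\r\n\tb")` gives `"Subject: a\tb"`.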

--David

Steve Holden · Feb 25, 2009

rdmurray said:
The process of moving from this folded multiple-line representation
of a header field to its single line representation is called
"unfolding". Unfolding is accomplished by simply removing any CRLF
that is immediately followed by WSP. Each header field should be
treated in its unfolded form for further syntactic and semantic
evaluation.

So, the whitespace characters are supposed to be left unchanged
after unfolding.

That would certainly appear to be the case. Thanks.

regards
Steve

Thorsten Kampe · Feb 25, 2009

* Tim Golden (Wed, 25 Feb 2009 17:27:07 +0000)

And I imagine that Gabriel's point was -- and my point certainly
is -- that Unicode includes all the characters *inside* the
ASCII range.

I know that this was Gabriel's point. And my point was that Gabriel's
point was pointless. If you call any text (or character) "Unicode" then
the word "Unicode" is generalized to an extent where it doesn't mean
anything at all anymore and becomes a buzz word.

By the same reasoning you could call ASCII a Unicode encoding (which it
isn't) because all ASCII characters are Unicode characters (code
points). Only encodings that cover the full Unicode range can reasonably
be called Unicode encodings.

The OP just saw some "weird characters" in the email subject and thought
"I know. It looks weird. Must be Unicode". But it wasn't. It was good
ole ASCII - only Quoted Printable encoded.
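
What the encoding amounts to can be seen by decoding one chunk of the earlier sample Subject by hand. This is a small sketch, assuming Python 3 and the standard library's `email.quoprimime.header_decode` (the thread itself is on Python 2.5). Strictly speaking, headers use the RFC 2047 "Q" encoding, a close relative of quoted-printable.

```python
from email import quoprimime

# In the Q encoding, "_" stands for a space and "=XX" for the byte with
# hexadecimal value XX. Everything in this chunk is in the ASCII range.
encoded = "to_=27d=27=2E_=287=29"
decoded = quoprimime.header_decode(encoded)
print(decoded)   # to 'd'. (7)
```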

Thorsten

Gabriel Genellina · Feb 25, 2009

rdmurray said:
Tab is not mentioned in RFC 2822 except to say that it is a valid
whitespace character. Header folding (insertion of <cr><lf>) can
occur most places whitespace appears, and is defined in section
2.2.3 thusly: [...]
So, the whitespace characters are supposed to be left unchanged
after unfolding.

Yep, there is an old bug report sleeping in the tracker about this...

Gabriel Genellina · Feb 25, 2009

On Wed, 25 Feb 2009 16:19:35 -0200, Thorsten Kampe wrote:

I know that this was Gabriel's point. And my point was that Gabriel's
point was pointless. If you call any text (or character) "Unicode" then
the word "Unicode" is generalized to an extent where it doesn't mean
anything at all anymore and becomes a buzz word.

If it's text, it should use Unicode. Maybe not now, but in a few years, it
will be totally unacceptable not to properly use Unicode to process
textual data.

Thorsten Kampe said:
By the same reasoning you could call ASCII a Unicode encoding (which it
isn't) because all ASCII characters are Unicode characters (code
points). Only encodings that cover the full Unicode range can reasonably
be called Unicode encodings.

Not at all. ASCII is as valid a character encoding ("coded character set"
as the Unicode guys like to say) as ISO 10646 (which covers the whole
range).

Thorsten Kampe said:
The OP just saw some "weird characters" in the email subject and thought
"I know. It looks weird. Must be Unicode". But it wasn't. It was good
ole ASCII - only Quoted Printable encoded.

Good f*cked ASCII is Unicode too.
