To decode the Subject =?iso-8859-2?Q?=... in email in python

Dan Polansky · Apr 20, 2005

When parsing messages using python's libraries email and mailbox, the
subject is often encoded using some kind of = notation. Apparently, the
encoding used in this notation is specified like =?iso-8859-2?Q?=... or
=?iso-8859-2?B?=. Is there a python library function to decode such a
subject, returning a unicode string? The use would be like

human_readable = cool_library.decode_equals(message['Subject'])

Thank you, Dan

Max M · Apr 20, 2005

Dan said:
> When parsing messages using python's libraries email and mailbox, the
> subject is often encoded using some kind of = notation. Apparently, the
> encoding used in this notation is specified like =?iso-8859-2?Q?=... or
> =?iso-8859-2?B?=. Is there a python library function to decode such a
> subject, returning a unicode string? The use would be like
>
> human_readable = cool_library.decode_equals(message['Subject'])

import email.Header

# header is the raw Subject string, e.g. message['Subject']
parts = email.Header.decode_header(header)
new_header = email.Header.make_header(parts)
human_readable = unicode(new_header)

--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science
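
For readers on Python 3, where the module is spelled email.header (lowercase) and unicode() no longer exists, the same two-step approach might look roughly like the sketch below. The helper name decode_subject and the sample header value are made up for illustration; they are not from the thread.

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Return an RFC 2047 encoded header value as a plain str."""
    # decode_header splits the value into (chunk, charset) pairs
    parts = decode_header(raw_subject)
    # make_header rebuilds a Header object; str() yields the decoded text
    return str(make_header(parts))


# Sample value made up for illustration: "Příliš" in ISO 8859-2, Q-encoded
print(decode_subject("=?iso-8859-2?Q?P=F8=EDli=B9?="))
```

A subject with no encoded words passes through unchanged, so the helper can be applied to every message without checking first.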

Roman Neuhauser · Apr 20, 2005

# (e-mail address removed) / 2005-04-20 00:30:35 -0700:

> When parsing messages using python's libraries email and mailbox, the
> subject is often encoded using some kind of = notation. Apparently, the
> encoding used in this notation is specified like =?iso-8859-2?Q?=... or
> =?iso-8859-2?B?=.

That's RFC 2047 encoding. Both examples introduce an ISO 8859-2
string: the first says it has been made ASCII-safe with the "Q"
encoding (a close variant of quoted-printable), the second says the
string is "B"ase64-encoded.

> Is there a python library function to decode such a
> subject, returning a unicode string? The use would be like
>
> human_readable = cool_library.decode_equals(message['Subject'])

Quoting from http://docs.python.org/lib/module-email.Header.html:

>>> from email.Header import decode_header
>>> decode_header('=?iso-8859-1?q?p=F6stal?=')
[('p\xf6stal', 'iso-8859-1')]
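
To make the RFC 2047 layout concrete, here is a minimal Python 3 sketch that takes one encoded word apart by hand: charset, then Q or B, then the payload. The function decode_encoded_word and both sample values are hypothetical, chosen only to show that the Q and B forms carry the same bytes. Real code should probably prefer email.header.decode_header, which also copes with several encoded words and surrounding plain text.

```python
import base64
import quopri


def decode_encoded_word(word):
    """Decode one '=?charset?encoding?payload?=' token by hand (illustrative only)."""
    # Splitting on '?' gives: '=', charset, encoding, payload, '='
    _, charset, encoding, payload, _ = word.split("?")
    if encoding.upper() == "B":
        raw = base64.b64decode(payload)
    elif encoding.upper() == "Q":
        # header=True makes '_' decode to a space, as the Q encoding requires
        raw = quopri.decodestring(payload.encode("ascii"), header=True)
    else:
        raise ValueError("unknown encoding: " + encoding)
    return raw.decode(charset)


# The same made-up text, "Příliš", once Q-encoded and once Base64-encoded
q_word = "=?iso-8859-2?Q?P=F8=EDli=B9?="
b_word = "=?iso-8859-2?B?UPjtbGm5?="
print(decode_encoded_word(q_word), decode_encoded_word(b_word))
```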

Neil Hodgson · Apr 20, 2005

Dan Polansky:

> When parsing messages using python's libraries email and mailbox, the
> subject is often encoded using some kind of = notation. Apparently, the
> encoding used in this notation is specified like =?iso-8859-2?Q?=... or
> =?iso-8859-2?B?=. Is there a python library function to decode such a
> subject, returning a unicode string? The use would be like
>
> human_readable = cool_library.decode_equals(message['Subject'])

Here is some code from a front end to Mailman moderation pages:

import email.Header
# sub is the raw Subject header string
hdr = email.Header.make_header(email.Header.decode_header(sub))

Neil
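
On Python 3 there is also a route that comes close to the one-liner Dan asked for: if the message is parsed with email.policy.default, looking up the header should already return decoded text. The sketch below assumes a made-up two-line message; none of the sample data comes from the thread.

```python
import email
from email import policy

# Made-up message; only the Subject header matters here
raw_message = "Subject: =?iso-8859-2?Q?P=F8=EDli=B9?=\n\nbody text\n"

# With policy.default, header lookups yield decoded str-like objects
msg = email.message_from_string(raw_message, policy=policy.default)
human_readable = str(msg["Subject"])
print(human_readable)
```

Without the policy argument the parser falls back to the older compat32 behaviour, and the subject comes back still encoded, which is the situation the thread started from.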

Dan Polansky · Apr 22, 2005

Max, thanks; that was helpful. Roman, your explanation was helpful as
well. Dan
