parsing email from stdin


Antoon Pardon

I want to do some postprocessing on messages from a particular mailbox,
so I use getmail, which fetches the messages and feeds them to the stdin
of my program.

As I don't know what encoding these messages will be in, I thought it
would be prudent to read stdin as binary data.

Using Python 3.3 on a Debian box, I have the following code:

#!/usr/bin/python3

import sys
from email import message_from_file

sys.stdin = sys.stdin.detach()
msg = message_from_file(sys.stdin)

which gives me the following traceback:

  File "/home/apardon/.getmail/verdeler", line 7, in <module>
    msg = message_from_file(sys.stdin)
  File "/usr/lib/python3.3/email/__init__.py", line 56, in message_from_file
    return Parser(*args, **kws).parse(fp)
  File "/usr/lib/python3.3/email/parser.py", line 58, in parse
    feedparser.feed(data)
  File "/usr/lib/python3.3/email/feedparser.py", line 167, in feed
    self._input.push(data)
  File "/usr/lib/python3.3/email/feedparser.py", line 100, in push
    data, self._partial = self._partial + data, ''
TypeError: Can't convert 'bytes' object to str implicitly

This seems rather odd to me. The following headers are in the message:

Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

So why doesn't the email parser look up the charset and use that
to convert the data to a string?

What is the canonical way to parse an email message from stdin?
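
For reference, one possible approach is sketched below. It assumes the
standard library's message_from_binary_file, which accepts a binary stream,
in place of message_from_file, which expects text. The helper names
parse_message and body_text are made up for illustration, and the charset
handling only covers a simple non-multipart message like the one described
above.

```python
import sys
from email import message_from_binary_file


def parse_message(stream):
    """Parse an email message from a binary file-like object."""
    return message_from_binary_file(stream)


def body_text(msg):
    """Decode a non-multipart body using the charset from Content-Type."""
    # decode=True undoes the Content-Transfer-Encoding and returns bytes.
    raw = msg.get_payload(decode=True)
    # Fall back to ASCII when no charset is declared (an assumption here).
    charset = msg.get_content_charset() or "ascii"
    return raw.decode(charset, errors="replace")


# Intended use in the getmail script (sys.stdin.buffer is the binary
# stream underneath sys.stdin, so no detach() is needed):
#
#     msg = parse_message(sys.stdin.buffer)
#     if not msg.is_multipart():
#         print(body_text(msg))
```

The parser only separates headers from body at this stage; converting the
body to str is a separate step because the charset is declared per part,
not for the message as a whole.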
 