Extracting email attachment when is_multipart() is False

Davor Cengija

I need to write a script that extracts the attachment from a text
file that was saved as a MIME mail message. Unfortunately,
Message.is_multipart() returns False, so msg.get_payload() returns the
complete message. What I need is the attachment only. Is it possible to do
that with the standard email package, without parsing the text by hand?

This is what my file/message looks like:

====== start here ========
This is a multi-part message in MIME format.

------=_NextPart_000_0026_01C3B347.DBEA9660
Content-Type: text/plain;
    charset="us-ascii"
Content-Transfer-Encoding: 7bit

CONTENT

signature, etc

------=_NextPart_000_0026_01C3B347.DBEA9660
Content-Type: application/octet-stream;
    name="filename.csv"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
    filename="filename.csv"

10012;20031118;292.67;4
101;23;19.98;2;39.96
102;24;21.89;4;87.56

------=_NextPart_000_0026_01C3B347.DBEA9660--

====== end here ========

So, I obviously need this part only:

10012;20031118;292.67;4
101;23;19.98;2;39.96
102;24;21.89;4;87.56

Python 2.3.2 on Windows.

Thanks and regards,

Davor
 
John J. Lee

Davor Cengija said:
I need to write a script that extracts the attachment from a text
file that was saved as a MIME mail message. Unfortunately,
Message.is_multipart() returns False, so msg.get_payload() returns the [...]
This is what my file/message looks like:

====== start here ========
This is a multi-part message in MIME format.

------=_NextPart_000_0026_01C3B347.DBEA9660
Content-Type: text/plain;
[...]

You seem to be missing the RFC 822 headers (From, To, Subject, etc.).


John
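
To illustrate the point about missing headers, here is a minimal sketch. It assumes a modern Python 3 interpreter rather than the 2.3.2 mentioned above, and the helper name describe and the boundary string BOUNDARY are made up for the example. Without a top-level Content-Type header that declares multipart and names the boundary, the parser has nothing telling it the text is multipart, so it treats everything as a single text/plain body.

```python
import email


def describe(raw_text):
    """Parse raw_text and report how the email package classified it.

    Returns (is_multipart, content_type, number_of_top_level_headers).
    """
    msg = email.message_from_string(raw_text)
    return msg.is_multipart(), msg.get_content_type(), len(msg.keys())


# A headerless body shaped like the file in the question (shortened).
# "BOUNDARY" is a placeholder, not the boundary from the original file.
body = (
    "This is a multi-part message in MIME format.\n"
    "\n"
    "--BOUNDARY\n"
    "Content-Type: text/plain\n"
    "\n"
    "CONTENT\n"
    "--BOUNDARY--\n"
)

# The same body with the one header the parser needs in order to split it.
with_header = 'Content-Type: multipart/mixed; boundary="BOUNDARY"\n\n' + body

print(describe(body))         # parsed as one non-multipart message
print(describe(with_header))  # parsed as multipart
```

The header that matters for parsing is Content-Type with its boundary parameter; From, To, and Subject are not needed for the parser to split the parts.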
 
Davor Cengija

John said:
Davor Cengija said:
This is a multi-part message in MIME format.

------=_NextPart_000_0026_01C3B347.DBEA9660
Content-Type: text/plain;
[...]

You seem to be missing the RFC 822 headers (From, To, Subject, etc.).

Yes, that's true. The question is whether it's easier to write a parser for
that kind of message or to force the message-producing application to output
the headers as well. We'll see...

Thanks
 
Derrick 'dman' Hudson

John said:
Davor Cengija said:
This is a multi-part message in MIME format.

------=_NextPart_000_0026_01C3B347.DBEA9660
Content-Type: text/plain;
[...]

You seem to be missing the RFC 822 headers (From, To, Subject, etc.).

Yes, that's true. The question is whether it's easier to write a parser for
that kind of message or to force the message-producing application to output
the headers as well. We'll see...

You have a third option, which I would try if you can't get the
message producer to do it correctly: slap some RFC822 headers on the
beginning, and then ignore them in the parsed message object. After
all, if the rest of the data is correctly formatted, use the existing
tested MIME parser. Prepending some "bogus" RFC822 headers would be
rather trivial to do.

-D
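
The "prepend some headers" option could be sketched roughly as follows. This is only an illustration under stated assumptions: it targets Python 3.5+ (get_content_disposition() does not exist in the 2.3.2 release mentioned in the question), the function name extract_attachments is hypothetical, the message is assumed to be multipart/mixed, and the boundary is guessed from the first line in the file that starts with "--".

```python
import email
import re


def extract_attachments(raw_text):
    """Return a list of (filename, payload_bytes) for each attachment part.

    raw_text is a MIME body whose top-level headers are missing.
    Assumption: the first line starting with "--" is a boundary delimiter.
    """
    # A delimiter line is "--" followed by the boundary string itself.
    match = re.search(r"^--(\S+)[ \t]*$", raw_text, re.MULTILINE)
    if match is None:
        raise ValueError("no MIME boundary line found")
    boundary = match.group(1)

    # Prepend just enough header for the parser to recognise the structure.
    # multipart/mixed is an assumption; the original header is unknown.
    bogus_headers = (
        "MIME-Version: 1.0\n"
        'Content-Type: multipart/mixed; boundary="%s"\n'
        "\n" % boundary
    )
    msg = email.message_from_string(bogus_headers + raw_text)

    attachments = []
    for part in msg.walk():
        if part.get_content_disposition() == "attachment":
            # decode=True undoes base64/quoted-printable and returns bytes.
            attachments.append((part.get_filename(), part.get_payload(decode=True)))
    return attachments
```

Calling extract_attachments() on the text of the saved file should give one entry for filename.csv, whose payload can then be written to disk. The sketch relies on the header continuation lines in the file (charset=..., name=..., filename=...) starting with whitespace, as folded headers are required to.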
 
John J. Lee

Derrick 'dman' Hudson said:
You have a third option, which I would try if you can't get the
message producer to do it correctly: slap some RFC822 headers on the
beginning, and then ignore them in the parsed message object. After
[...]

Or read the docs & code for the email module, to figure out how to
persuade it to take the messages without the headers.


John
 
