Handling emails

Fulvio

***********************
Your mail has been scanned by InterScan MSS.
***********************


Hello,

I'd like to ask for some clues to move my project forward :)
The purpose is to gather all emails (local and remote ones) for a backup.
I've tried to get ideas by reading about the modules included with Python,
but neither the email package nor mailbox gave me an idea of how to handle *one*
complete email, with its payload and attachments (if any).
Doing that would give me the chance to avoid duplicates and, I suppose, to get
as far as writing a maildir or mbox file.
Pointers to useful information are welcome. Of course I'm not asking for a
ready-made solution :).

BTW, sorry to anyone annoyed by the "virus scan banner". It's not generated
by me, but by my ISP. I'll try to contact my ISP and ask them to remove it
from the email body.

F
 
Dennis Lee Bieber

This purpose will give me chance to avoid duplicates and I suppose to achieve

The only "reliable" way to check for duplicate messages, that I know
of, is to compare the message header (sample follows):

Message-id:
<[email protected]>

as the ID is generally created when the message is submitted to the
delivery system.

If the IDs are different with all other parts of the message the
same (well, I'd expect time-stamps to be different also) then the
messages were submitted separately.
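
As a rough illustration of the Message-ID comparison described above, here is a minimal sketch in present-day Python 3 (the thread itself predates it). The function name copy_unique and both path arguments are made up for the example. It assumes the archive is an mbox file and that the Message-ID is a good enough key, which only holds when the delivery system actually assigned one.

```python
import mailbox


def copy_unique(src_path, dst_path):
    """Copy messages from one mbox to another, skipping repeated Message-IDs.

    Returns (copied, skipped). Messages without a Message-ID are always
    copied, because there is nothing to compare them on.
    """
    seen = set()
    copied = skipped = 0
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    try:
        for msg in src:
            msg_id = msg.get("Message-ID")  # header lookup is case-insensitive
            if msg_id is not None:
                msg_id = msg_id.strip()
                if msg_id in seen:
                    skipped += 1
                    continue
                seen.add(msg_id)
            dst.add(msg)
            copied += 1
        dst.flush()
    finally:
        src.close()
        dst.close()
    return copied, skipped
```

Messages that differ only in their time-stamps but carry different IDs are treated as separate submissions here, in line with the reasoning above.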
--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
Gerard Flanagan

Fulvio said:
Hello,

I'd like to ask for some clues to move my project forward :)
The purpose is to gather all emails (local and remote ones) for a backup.
I've tried to get ideas by reading about the modules included with Python,
but neither the email package nor mailbox gave me an idea of how to handle *one*
complete email, with its payload and attachments (if any).

The 'PopClient' class here might help you:

http://gflanagan.net/site/python/pagliacci/pagliacci.py.html

A script I use is here:

http://gflanagan.net/site/python/pagliacci/getpopmail.py.html

There's a woeful lack of comments but hopefully you can figure it out!
It saves emails to the filesystem along with attachments. (Note that
the POP account passwords are stored as plain text.) It checks a mail's
message-id and doesn't download if it already has been downloaded.
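
For readers who only want the "save attachments to the filesystem" step, the following is a minimal sketch in present-day Python 3 using only the standard email package. It is not taken from the scripts linked above; the function name save_attachments and the numbering scheme for file names are assumptions made for the example.

```python
import os
from email import message_from_bytes


def save_attachments(raw_bytes, out_dir):
    """Write every part that carries a filename to out_dir; return the paths."""
    msg = message_from_bytes(raw_bytes)
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for index, part in enumerate(msg.walk()):
        if part.is_multipart():
            continue  # containers hold other parts, no data of their own
        filename = part.get_filename()
        if not filename:
            continue  # body text or an inline part without a name
        # Never trust a sender-supplied path: keep only the final component.
        safe_name = "%03d-%s" % (index, os.path.basename(filename))
        path = os.path.join(out_dir, safe_name)
        with open(path, "wb") as fh:
            # decode=True undoes base64 / quoted-printable transfer encoding
            fh.write(part.get_payload(decode=True) or b"")
        saved.append(path)
    return saved
```

Prefixing the part index keeps two attachments with the same name from overwriting each other.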

hth

Gerard
 
Fulvio

Message-id:
<[email protected]>

as the ID is generally created when the message is submitted to the
delivery system

Well, that's indeed what I've been dealing with for a couple of weeks :)
Some messages might not have one, but Kmail should have fixed that.
The second issue is how to get the whole email out of the mbox file. My
problem still remains, partially. I'm not sure whether the procedure below
will give me the complete emails, including multipart ones.

8<-------------8<-------------8<-------------8<-------------8<-------------
#! /usr/bin/env python

from __future__ import generators

import email, re
import email.Errors
import email.Message
import mailbox

def getmbox(name):
    """Return an mbox iterator given a file name."""
    fp = open(name, "rb")
    mbox = mailbox.PortableUnixMailbox(fp, get_message)
    return iter(mbox)

def get_message(obj):
    """Return an email Message object."""
    if isinstance(obj, email.Message.Message):
        return obj
    # Create an email Message object.
    if hasattr(obj, "read"):
        obj = obj.read()
    try:
        msg = email.message_from_string(obj)
    except email.Errors.MessageParseError:
        # The headers are probably damaged: strip them and wrap the
        # rest of the text in a bare Message object.
        headers = extract_headers(obj)
        obj = obj[len(headers):]
        msg = email.Message.Message()
        msg.set_payload(obj)
    return msg

header_break_re = re.compile(r"\r?\n(\r?\n)")

def extract_headers(text):
    """Very simple-minded header extraction"""
    m = header_break_re.search(text)
    if m:
        eol = m.start(1)
        text = text[:eol]
    if ':' not in text:
        text = ""
    return text

if __name__ == '__main__':
    # simple trial with a local mbox file
    import sys
    try:
        file = sys.argv[1]
    except IndexError:
        # a missing argument raises IndexError, not IOError
        print 'Must give a mbox file: program /path/to_mbox'
        sys.exit(1)
    # a for loop ends cleanly when the mailbox is exhausted
    for full_email in getmbox(file):
        print '%78s' % full_email
        answer = raw_input('More? Y/N ')
        if answer.lower() == 'n': break
8<-------------8<-------------8<-------------8<-------------8<-------------
I admit that not all of the code is my own design ;-) some of it was
extrapolated from other programs.
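
One way to check whether multipart messages come through intact is to list the parts of each parsed message. The sketch below uses present-day Python 3 rather than the Python 2 of the script above; list_parts is a made-up helper name, and it assumes the message has already been parsed into a Message object.

```python
def list_parts(msg):
    """Return (content_type, filename) for every part of a parsed message.

    walk() visits the message itself first and then every nested part,
    so a plain single-part email yields exactly one entry.
    """
    return [(part.get_content_type(), part.get_filename())
            for part in msg.walk()]


# Possible use with an mbox file (the path is a placeholder):
#
#     import mailbox
#     for msg in mailbox.mbox("/path/to_mbox", create=False):
#         print(msg.get("Message-ID"), list_parts(msg))
```

If a message shows a multipart/* entry followed by its text and attachment parts, the parser kept the whole structure, and serialising the Message object writes all of it back out.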

F
 
Fulvio

The 'PopClient' class here might help you:
Thank you both for the replies.
Gerard,
I'll certainly take a peek at that code ;-) to gain a wider perspective. One
small drawback is that I've already had good success with the POP side :) But
I won't give up; perhaps there will be some good ideas in it.
I'm more interested in the IMAP side of the issue, even though I might avoid
it for my own purposes.

Lastly, supposing I publish the "thing", what would be the appropriate site?
At present my program can do filtering and synchronizing with the local MUA's
trash. There are small problems with the IMAP protocol, but it may work for
POP3 or IMAP4 with regex filter options. Alpha testers (cooperators) needed ;-)

F
 
Dennis Lee Bieber

header_break_re = re.compile(r"\r?\n(\r?\n)")

def extract_headers(text):
    """Very simple-minded header extraction"""

    m = header_break_re.search(text)
    if m:
        eol = m.start(1)
        text = text[:eol]
    if ':' not in text:
        text = ""
    return text

Why so much effort when there are modules that handle all that
parsing for you?

Beware of line wrapping
-=-=-=-=-=-=-=-=-=-
# based upon bits from the (Active)Python 2.4 help system and other code...
import email
import email.Errors
import mailbox

def msgfactory(fp):
    try:
        return email.message_from_file(fp)
    except email.Errors.MessageParseError:
        return ""


# simple test

MAILBOX = r"C:\Documents and Settings\Dennis Lee Bieber\Application Data\Qualcomm\Eudora\Junk.mbx"
fmbx = open(MAILBOX, "rb")


mbox = mailbox.PortableUnixMailbox(fmbx, msgfactory)

for msg in mbox:
    for hdr in msg.keys():
        if hdr == "Received":
            for rcvd in msg.get_all(hdr, "NO RECEIVED HISTORY!"):
                print "The header field '%s' has the value '%s'" % (hdr, rcvd)
        else:
            print "The header field '%s' has the value '%s'" % (hdr,
                    msg.get(hdr, "NO VALUE?"))
-=-=-=-=-=-=-=-=-=-

Sorry my junk box only has one message at the moment

-=-=-=-=-=-=-=-=-
The header field 'X-Persona' has the value '<Alternate>'
The header field 'Status' has the value 'U'
The header field 'Return-Path' has the value
'<[email protected]>'
The header field 'Received' has the value 'from ESTKIKE
([168.234.230.76])
by mx-avoceta.atl.sa.earthlink.net (EarthLink SMTP Server) with SMTP
id 1gCAzA1IK3Nl34k1
for <[email protected]>; Wed, 25 Oct 2006 00:41:03 -0400 (EDT)'
The header field 'From' has the value '"L´agencia Models"
<[email protected]>'
The header field 'To' has the value '(e-mail address removed)'
The header field 'Subject' has the value 'Modelos, Edecanes y más...'
The header field 'Mime-Version' has the value '1.0'
The header field 'Content-Type' has the value 'text/html;
charset="iso-8859-1"'
The header field 'Date' has the value 'Tue, 24 Oct 2006 22:38:25'
The header field 'Message-Id' has the value
'<[email protected]>'
The header field 'X-ELNK-Info' has the value 'spv=0;'
The header field 'X-ELNK-AV' has the value '0'
The header field 'X-ELNK-Info' has the value 'spv=0;'

--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
Fulvio

The 'PopClient' class here might help you:

I took a look, rather late. Let me say that it is impressively pythonic :)
On the other hand, I'd have to dig deeper into Python classes, especially as
the PopClient expects a different config file and would upset a big chunk of
my code.
The functions are good and I'll study the email handling; I just need some
time to prepare a method to distinguish local from remote emails and a way to
save/recover the most important ones.

F
 
Gerard Flanagan

Fulvio said:
I took a look, rather late. Let me say that it is impressively pythonic :)

I don't know if everyone would agree with you, but thanks!

Have you looked at the email.utils module? Some useful functions there.
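
A few of those functions, sketched in present-day Python 3 for illustration. The name/address pair and the backup.invalid domain are placeholders invented for the example, and ensure_message_id is a hypothetical helper for messages that arrive without an ID, not something from the scripts discussed in this thread.

```python
from email.utils import make_msgid, parseaddr, parsedate_to_datetime

# Split a display name from the address in a From/To style header value.
name, address = parseaddr('"Gerard Flanagan" <gerard@example.org>')

# Turn an RFC 2822 date string into a timezone-aware datetime.
when = parsedate_to_datetime("Wed, 25 Oct 2006 00:41:03 -0400")


def ensure_message_id(msg, domain="backup.invalid"):
    """Give msg a Message-ID if it lacks one; return the ID in use."""
    if msg.get("Message-ID") is None:
        msg["Message-ID"] = make_msgid(domain=domain)
    return msg["Message-ID"]
```

An ID generated this way is only useful for duplicate detection once the message has been stored with it, since a second download of the same ID-less message would get a different value.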

All the best

Gerard
 
Fulvio

# based upon bits from the (Active)Python 2.4 help system and other code...
OK, good help, thank you.
As I confessed, that code was extrapolated from other sources, and I took it
for granted as long as it worked for my purpose.
My main goal is to get the whole mail. The last function is of no use for that.
Both versions work well; the only difference is whether they call
email.message_from_file or email.message_from_string.
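
To make that difference concrete, here is a small sketch in present-day Python 3, where the email package also offers bytes-based variants. The sample message text is invented for the example.

```python
import email
import io

RAW = (
    "From: sender@example.org\n"
    "Subject: test\n"
    "\n"
    "body text\n"
)

# Same parser underneath; only the kind of input differs.
from_string = email.message_from_string(RAW)
from_file = email.message_from_file(io.StringIO(RAW))
from_bytes = email.message_from_bytes(RAW.encode("ascii"))
from_binary = email.message_from_binary_file(io.BytesIO(RAW.encode("ascii")))
```

When reading from a mailbox opened in binary mode, the bytes-based variants avoid having to guess a text encoding first.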

F
 
Ben Finney

Fulvio said:
***********************
Your mail has been scanned by InterScan MSS.
***********************

Please stop sending messages with obnoxious headers like this.
 
Steve Holden

Ben said:
Please stop sending messages with obnoxious headers like this.
Please stop sending messages with obnoxious content like this.

If you insist on telling someone off publicly via a newsgroup, once is
enough. I agree it's a pain, but Fulvio may not have it in his power to
switch the header off. Mail admins do some incredibly stupid things.

regards
Steve
 
Ben Finney

Steve Holden said:
Please stop sending messages with obnoxious content like this.

Yes, I guess I should have expected a response like that from someone
:)
If you insist on telling someone off publicly via a newsgroup, once
is enough.

Apparently not. I tried initially contacting him privately. Then I
tried explaining in a (single) message to the list.
I agree it's a pain, but Fulvio may not have it in his power to
switch the header off. Mail admins do some incredibly stupid things.

There is always the option to not send messages to this list using
that mail server. I don't care what option is taken, so long as the
useless and obnoxious headers on his messages stop.
 
Peter Decker

There is always the option to not send messages to this list using
that mail server. I don't care what option is taken, so long as the
useless and obnoxious headers on his messages stop.

--
\ Lucifer: "Just sign the Contract, sir, and the Piano is yours." |
`\ Ray: "Sheesh! This is long! Mind if I sign it now and read it |
_o__) later?" -- http://www.achewood.com/ |
Ben Finney

Ah, but obnoxious footers are OK, I guess.
 
Ben Finney

Peter Decker said:
Ah, but obnoxious footers are OK, I guess.

I guess. I maintain that there *is* a qualitative difference between a
spammy header block, irrelevant to the message content but intruding
upon it, at the top of the message; and a signature block at the
bottom, *after* all the relevant material, separated by a standard
signature block separator.
 
Fulvio

There is always the option to not send messages to this list using
that mail server

Once again, sorry for that. I'll take action to switch to another mail server.
Thanks for the advice.

F
 
Ben Finney

Fulvio said:
Once again, sorry for that. I'll take action to switch to another
mail server. Thanks for the advice.

That would be very much appreciated, thank you.
 
