Help on Email Parsing

Discussion in 'Python' started by dont bother, Feb 23, 2004.

  1. dont bother

    dont bother Guest

    Hey,
    I have been trying to parse emails, but I could not find
    any examples or snippets of parsing emails in Python in
    the documentation. Google did not help much either.
    I am trying to understand the 'email' module and the
    functions described there for parsing email, but it seems
    difficult.
    Can anyone help me locate some pointers or snippets
    on this issue?
    Thanks a ton,
    Dont

    __________________________________
    Do you Yahoo!?
    Yahoo! Mail SpamGuard - Read only the mail you want.
    http://antispam.yahoo.com/tools
     
    dont bother, Feb 23, 2004
    #1

  2. On Mon, 23 Feb 2004 00:47:17 -0800, dont bother wrote:

    > I have been trying to parse emails:

    > But I could not find any examples or snippets of parsing emails in
    > python from the documentation.


    Here is a simple program (a bit of a hack) I wrote to count the number of
    messages in a mailbox for each day (I use it for counting spam). It may be
    of some use to you, although it does not parse the message body itself,
    only the headers.

    Jeremy

    # Released under the GPL (version 2 or greater)
    # Copyright (C) 2003 Jeremy Sanders

    import mailbox
    import string
    import email
    import email.Utils
    import time
    import sys

    # open passed mailbox filename
    # (yes - we need checking of this)
    fp = open(sys.argv[1], 'r')

    # open mailbox from file
    mbox = mailbox.PortableUnixMailbox(fp)

    secsinday = 86400
    counts = {}

    # get current time
    nowtime = time.time()

    # iterate over mail messages
    while 1:
        # get next message
        msg = mbox.next()
        # exit if we've looked at the last one
        if msg is None:
            break

        # get received header
        received = msg.get('received')
        # skip messages with no received header
        if received is None:
            continue

        # get unix time of email
        # (the date follows the last ';' in the received header)
        date_rfind = string.rfind(received, ';')
        date = received[date_rfind+1:]
        pd = email.Utils.parsedate(string.strip(date))

        # skip messages we can't parse the date on
        if pd is None:
            continue

        # get time between now and received date in message
        unixtime = time.mktime(pd)
        day = int((unixtime - nowtime) / secsinday)

        # increment counter for day
        # (using a dict allows us to parse the messages only once)
        if day not in counts:
            counts[day] = 0
        counts[day] += 1

    # sort days into numerical order
    daylist = counts.keys()
    daylist.sort()

    # print out counts
    for d in daylist:
    print d, counts[d]
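
For readers on Python 3, where `mailbox.PortableUnixMailbox`, the `string` module functions and the `print` statement used above are no longer available, the same idea might be sketched roughly as follows. This is only an illustration under stated assumptions, not the original author's code: the helper names `received_date` and `count_by_day` are made up for this example, and it counts whole days elapsed since receipt (0 = within the last 24 hours) rather than the negative offsets the script above produces.

```python
import email
import email.utils
from collections import Counter
from datetime import datetime, timezone


def received_date(msg):
    """Return the date of the topmost Received header, or None.

    Hypothetical helper: the date is assumed to follow the last ';'
    in the header, as in the script above.
    """
    received = msg.get('Received')
    if received is None:
        return None
    _, sep, date_text = received.rpartition(';')
    if not sep:
        return None
    try:
        when = email.utils.parsedate_to_datetime(date_text.strip())
    except (TypeError, ValueError):
        # unparseable date (the exception type depends on the Python version)
        return None
    if when.tzinfo is None:
        # assumption: treat dates without a usable zone as UTC
        when = when.replace(tzinfo=timezone.utc)
    return when


def count_by_day(messages, now=None):
    """Count messages by the number of whole days since they were received."""
    if now is None:
        now = datetime.now(timezone.utc)
    counts = Counter()
    for msg in messages:
        when = received_date(msg)
        if when is None:
            continue  # no Received header, or no parseable date
        counts[(now - when).days] += 1
    return counts
```

Assuming an mbox-format file, something like `count_by_day(mailbox.mbox(path))` (after `import mailbox`) should work, since iterating over a `mailbox.mbox` yields message objects; `sorted(counts.items())` then gives the per-day counts in order.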
     
    Jeremy Sanders, Feb 23, 2004
    #2

  3. deelan

    deelan Guest

    dont bother wrote:

    > Hey,
    > I have been trying to parse emails:
    > But I could not find any examples or snippets of
    > parsing emails in python from the documentation.
    > Google did not help me much too.
    > I am trying to understand the module 'email' and the
    > functions described there to parse email but seems
    > difficult.
    > Can anyone help me in locating some pointers or
    > snippets on this issue.


    This script extracts one or more images from an email
    message file given as an argument.

    Hope this helps.



    """Extracts all images from given rfc822-compliant email message.
    A quick hack by deelan

    python extract.py filename
    """

    # good MIME's
    mimes = 'image/gif', 'image/jpeg', 'image/png'

    import email

    def main(filename):
        f = file(filename, 'r')
        m = email.message_from_file(f)
        f.close()

        # loop thru message body and look for JPEG, GIF and PNG images
        images = [(part.get_filename(), part.get_payload(decode=True))
                  for part in m.get_payload() if part.get_type() in mimes]

        for name, data in images:
            print 'writing', name, '...'
            f = file(name, 'wb')
            f.write(data)
            f.close()

        print 'done %d image(s).' % len(images)

    if __name__ == '__main__':
        import sys
        if len(sys.argv) > 1:
            main(sys.argv[1])
        else:
            print __doc__
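
On Python 3 the `file()` builtin, the `print` statement and `get_type()` used above are gone, so a rough modern equivalent might look like the sketch below. It is an illustration only, not deelan's code: `extract_images` and `save_images` are hypothetical names, and the sketch uses `walk()` so that images nested inside multipart containers are found and a non-multipart message does not raise an error.

```python
import email
import os
from email import policy

# same "good" MIME types as in the script above
IMAGE_TYPES = {'image/gif', 'image/jpeg', 'image/png'}


def extract_images(msg):
    """Return a list of (filename, bytes) for every image part in msg."""
    images = []
    for index, part in enumerate(msg.walk()):
        if part.get_content_type() not in IMAGE_TYPES:
            continue
        # assumption: invent a name when the part does not carry one
        name = part.get_filename() or 'image-%d' % index
        images.append((name, part.get_payload(decode=True)))
    return images


def save_images(path):
    """Parse the message file at path and write its images to the current directory."""
    with open(path, 'rb') as fp:
        msg = email.message_from_binary_file(fp, policy=policy.default)
    images = extract_images(msg)
    for name, data in images:
        # basename() drops any directory components from the attachment name
        with open(os.path.basename(name), 'wb') as out:
            out.write(data)
    return len(images)
```

Keeping the extraction separate from the file writing makes it possible to inspect what would be written before touching the disk, which seems prudent for attachments from untrusted mail.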



    --
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    <#me> a foaf:person ; foaf:nick "deelan" ;
    foaf:weblog <http://www.deelan.com/> .
     
    deelan, Feb 23, 2004
    #3
  4. John Roth

    John Roth Guest

    "dont bother" <> wrote in message
    news:...
    > Hey,
    > I have been trying to parse emails:
    > But I could not find any examples or snippets of
    > parsing emails in python from the documentation.
    > Google did not help me much too.
    > I am trying to understand the module 'email' and the
    > functions described there to parse email but seems
    > difficult.
    > Can anyone help me in locating some pointers or
    > snippets on this issue.
    > Thanks a Ton
    > Dont


    You may want to study the MIME format a
    bit first. It's not a particularly simple format.

    The final example in the email documentation
    seems to be fairly straightforward. The line:

    msg = email.message_from_file(fp)

    does everything and leaves the result in
    memory as objects.

    Of course, this is the *new* email package
    that is in 2.2.3 and later. I don't believe the
    old one was particularly easy to work with.
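
To illustrate what "leaves the result in memory as objects" means in practice, here is a minimal sketch using `email.message_from_string` (the string counterpart of `message_from_file`) on a made-up message. The sample text, the addresses and the `summarize` helper are assumptions for illustration only, written for Python 3.

```python
import email

# a made-up, single-part message used only for this illustration
RAW = """\
From: alice@example.com
To: bob@example.com
Subject: Hello

This is the body.
"""


def summarize(msg):
    """Pull a few fields out of a parsed message object."""
    return {
        'from': msg['From'],               # headers are looked up by name
        'subject': msg['Subject'],
        'multipart': msg.is_multipart(),   # True when the payload is a list of parts
        'body': msg.get_payload(),         # a string for a simple text message
    }


msg = email.message_from_string(RAW)
summary = summarize(msg)
```

For a multipart (MIME) message, `get_payload()` returns a list of sub-message objects instead of a string, which is where studying the MIME format first pays off.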

    John Roth

     
    John Roth, Feb 23, 2004
    #4
