Re: Email parsing library

Discussion in 'Java' started by Rainer Frey, May 27, 2009.

  1. Rainer Frey

    Rainer Frey Guest

    Spud wrote:

    > I need some code that will split out and recognize common things in an
    > RFC 2822 email: headers, body text, quoted sections, signature blocks,
    > mime attachments.


    [...]

    > (No, Javamail doesn't cut it).


    Why?

    Rainer
    Rainer Frey, May 27, 2009
    #1

  2. Rainer Frey

    Rainer Frey Guest

    Spud wrote:

    > Rainer Frey wrote:
    >> Spud wrote:
    >>
    >>> I need some code that will split out and recognize common things in an
    >>> RFC 2822 email: headers, body text, quoted sections, signature blocks,
    >>> mime attachments.
    >>
    >> [...]
    >
    > Because Javamail doesn't distinguish quoted sections or signature blocks
    > within the body of the email. It's not designed for deep analysis of the
    > content of an email.


    I overlooked the requirement for quoted sections and signature blocks. But
    as nobody has suggested an alternative, I still think JavaMail is a good
    base to start from. It reads the headers, of course, and it also analyses
    the mail body as deep as the MIME parts go. With this you can extract
    attachments and distinguish plain-text and HTML mail parts.

    Quoted sections and signatures are part of one single plain-text part (which
    is the only part in a plain-text mail without any attachments). JavaMail
    defines an API (built on the JavaBeans Activation Framework) for content
    handlers for a given content type. You could implement and register a
    content handler for text/plain that creates an object representation of
    the quoted sections, signature and actual text instead of the default
    string representation. I don't know of any existing code for this, but the
    third-party download page for JavaMail lists several desktop and web mail
    clients, some of them open source, that might contain something like this.
    Look through http://java.sun.com/products/javamail/Third_Party.html;
    unfortunately, not all of the links there still work.
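
    To make "object representation" a bit more concrete, here is a minimal
    sketch of the splitting step on its own, using only the standard library
    and no JavaMail at all. The names BodySplitter, Kind and Segment are made
    up for illustration. It also assumes two common conventions that the
    thread does not spell out: quoted lines start with ">", and a line
    consisting of exactly "-- " starts the signature. Real mail often breaks
    both, so treat this as a starting point rather than a finished parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JavaMail: splits a text/plain body into
// runs of quoted lines, ordinary text, and a trailing signature block.
final class BodySplitter {

    enum Kind { TEXT, QUOTE, SIGNATURE }

    // One run of consecutive lines that share the same kind.
    static final class Segment {
        final Kind kind;
        final List<String> lines = new ArrayList<>();

        Segment(Kind kind) {
            this.kind = kind;
        }
    }

    static List<Segment> split(String body) {
        List<Segment> segments = new ArrayList<>();
        boolean inSignature = false;

        for (String line : body.split("\r?\n")) {
            Kind kind;
            if (inSignature) {
                // Everything after the delimiter belongs to the signature.
                kind = Kind.SIGNATURE;
            } else if (line.equals("-- ")) {
                // Assumed convention: "-- " alone on a line starts the signature.
                inSignature = true;
                continue; // the delimiter line itself is not stored
            } else if (line.startsWith(">")) {
                // Assumed convention: a leading ">" marks a quoted line.
                kind = Kind.QUOTE;
            } else {
                kind = Kind.TEXT;
            }

            // Start a new segment whenever the kind changes.
            if (segments.isEmpty()
                    || segments.get(segments.size() - 1).kind != kind) {
                segments.add(new Segment(kind));
            }
            segments.get(segments.size() - 1).lines.add(line);
        }
        return segments;
    }
}
```

    A custom text/plain content handler could call something like
    BodySplitter.split(...) on the decoded body and return the segment list
    instead of a String. Nested quote depth (">>", "> >") and signatures
    without the "-- " delimiter would need extra heuristics.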

    Then there is the quite mature but discontinued desktop mail client
    ColumbaMail at http://columbamail.org, which seems to contain code that at
    least marks quoted sections.

    Rainer
    Rainer Frey, Jun 2, 2009
    #2