We have written a number of Java tools for analyzing the contents of an
IMAP mailbox. What we would like to be able to do is save a particular
email message as a file on our hard drive. Is there a more or less
generic format for doing so and, if so, can it be created in Java?
(Something like .eml)
(Unix) mbox format. It is extremely simple, and for a single message it
almost boils down to just the message's contents as received.
An mbox file is an ASCII file, containing a sequence of e-mails. Each
entry (e-mail) in an mbox begins with a "From " line (note, no ':' in
that tag). This is *not* the "From:"-header (note the ':' here), but was
originally the UUCP path. Today the SMTP sender should go there if
possible but mail programs often place nonsense into it, or repeat the
"From:"-header.
E.g. a "From " line looks like:
From (e-mail address removed) Fri Mar 28 10:02:15 2006
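A minimal Java sketch of building such a line. The class and method
names are made up for illustration, and the date layout is an
assumption modeled on the sample line above:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class MboxFromLine {
    // Date layout modeled on the sample "From " line (an assumption).
    // Locale.ENGLISH keeps day and month names independent of the default locale.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss yyyy", Locale.ENGLISH);

    /** Builds the "From " separator line (no ':' after "From") for one entry. */
    static String fromLine(String envelopeSender, LocalDateTime received) {
        return "From " + envelopeSender + " " + received.format(STAMP);
    }
}
```

Calling MboxFromLine.fromLine("sender@example.org", LocalDateTime.now())
gives a line of the shape shown above.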
The "From " line is followed by the mail: first the mail headers, then
the mail body, separated by an empty line. This is the Internet Message
Format as specified in RFC 2822.
At the end of the mail an empty line is appended.
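To make that layout concrete, here is a rough Java sketch that takes
one entry apart again. MboxEntryParts is a hypothetical name, and the
sketch assumes plain "\n" line endings:

```java
final class MboxEntryParts {
    final String fromLine;  // the "From " separator line
    final String headers;   // header lines, without the separating empty line
    final String body;      // everything after the first empty line

    private MboxEntryParts(String fromLine, String headers, String body) {
        this.fromLine = fromLine;
        this.headers = headers;
        this.body = body;
    }

    /** Splits one mbox entry into "From " line, headers and body. */
    static MboxEntryParts parse(String entry) {
        int endOfFromLine = entry.indexOf('\n');
        if (!entry.startsWith("From ") || endOfFromLine < 0) {
            throw new IllegalArgumentException("not an mbox entry");
        }
        String fromLine = entry.substring(0, endOfFromLine);
        String mail = entry.substring(endOfFromLine + 1);

        // The first empty line separates the headers from the body.
        int blank = mail.indexOf("\n\n");
        String headers = blank < 0 ? mail : mail.substring(0, blank);
        String body = blank < 0 ? "" : mail.substring(blank + 2);
        return new MboxEntryParts(fromLine, headers, body);
    }
}
```

Note that the body returned here still carries the empty line that
ends the entry.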
In case a line in the mail body starts with "From ", a '>' is
prepended to it, so that this body line is not interpreted as the
"From " line which starts a new entry. That quote is supposed to be
removed again when a program displays the mail.
A common extension is that any line in the mail body which already
starts with a quoted from (e.g. ">From ", or ">>From ") is also quoted
one more time, and that a program displaying such a mail always removes
one quote level from a line which starts with a quoted from.
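The quoting rule, including the common extension, could be sketched in
Java roughly like this. MboxFromQuoting and its method names are
invented for the example:

```java
import java.util.regex.Pattern;

final class MboxFromQuoting {
    // "From " at line start, possibly already quoted (">From ", ">>From ", ...).
    private static final Pattern FROM_LIKE = Pattern.compile("^>*From ");
    // Only the quoted forms, i.e. at least one leading '>'.
    private static final Pattern QUOTED_FROM = Pattern.compile("^>+From ");

    /** Applied to every body line when writing an entry. */
    static String quote(String bodyLine) {
        return FROM_LIKE.matcher(bodyLine).find() ? ">" + bodyLine : bodyLine;
    }

    /** Applied to every body line when displaying a mail: removes one quote level. */
    static String unquote(String bodyLine) {
        return QUOTED_FROM.matcher(bodyLine).find() ? bodyLine.substring(1) : bodyLine;
    }
}
```

Because every level is quoted once more on writing and one level is
removed on reading, unquote(quote(line)) gives back the original line.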
The result looks something like:
From (e-mail address removed) Fri Mar 28 10:02:15 2006
Subject: some subject
Date: 28 Mar 2006 08:02:15 GMT
From: (e-mail address removed)
To: (e-mail address removed)

The mail body
>From now on we do the following things ...
[an empty line]
[next mail, if any, follows here]
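Putting the pieces together, a Java sketch that turns one raw message
into such an entry might look as follows. MboxEntryWriter is a
hypothetical helper and the date layout is again only modeled on the
sample line. If you use JavaMail, the raw text can presumably be
obtained with Message.writeTo(OutputStream); the sketch itself needs
only the JDK (11 or later, for String.lines()):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class MboxEntryWriter {
    // Date layout modeled on the sample "From " line (an assumption).
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss yyyy", Locale.ENGLISH);

    /** Renders one mbox entry from a raw message (headers, empty line, body). */
    static String toEntry(String envelopeSender, LocalDateTime received, String rawMessage) {
        StringBuilder out = new StringBuilder();
        out.append("From ").append(envelopeSender).append(' ')
           .append(received.format(STAMP)).append('\n');

        List<String> lines = rawMessage.lines().collect(Collectors.toList());
        boolean inBody = false;
        for (String line : lines) {
            if (!inBody && line.isEmpty()) {
                inBody = true;              // first empty line ends the headers
            } else if (inBody && line.matches(">*From .*")) {
                line = ">" + line;          // quote "From " (and quoted forms) in the body
            }
            out.append(line).append('\n');
        }
        out.append('\n');                   // empty line that ends the entry
        return out.toString();
    }
}
```

The returned string can be written with Files.writeString, or appended
to an existing mbox file to add another entry.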
Some people hate the format, because it is not "database-ish" enough for
them. But it works like a charm.
The file should contain all information pertaining to the email (from,
to, subject, content, attachments, etc.)
Attachments are sent inline in a mail body, so that is no problem (mails
with attachments are just MIME mails). The other stuff is in the
headers. In case you need to store additional management information
of your own, it is typical to invent "X-" headers and just add them to
the normal mail headers:
X-Your-App-Name-Something: a value
X-Your-App-Name-Something-else: another value
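As a rough Java sketch, adding such a header to a raw message before
storing it could look like this. ExtraHeaders is a made-up name, and
"\n" line endings are assumed:

```java
final class ExtraHeaders {
    /** Inserts "name: value" as the last header, just before the empty line ending the headers. */
    static String addHeader(String rawMessage, String name, String value) {
        if (!name.startsWith("X-")) {
            throw new IllegalArgumentException("private headers should start with X-");
        }
        String header = name + ": " + value + "\n";
        int blank = rawMessage.indexOf("\n\n");
        if (blank < 0) {
            // No empty line, so no body: the whole text consists of headers.
            return rawMessage.endsWith("\n") ? rawMessage + header : rawMessage + "\n" + header;
        }
        // Keep the newline that ends the last header, then add ours,
        // then the untouched empty line and body.
        return rawMessage.substring(0, blank + 1) + header + rawMessage.substring(blank + 1);
    }
}
```

The body and the existing headers are left exactly as they were.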
In case you need per-mbox file information, it is typical to add a
pseudo mail to the beginning of such a file. Your program is then
supposed to know about this mail.
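When reading such a file back, a Java sketch along these lines could
split it into entries and skip the pseudo mail. MboxSplitter is a
hypothetical name, and the sketch simply assumes that the pseudo mail
is always the first entry and that "From " in mail bodies was quoted
when the file was written:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

final class MboxSplitter {
    /** Splits mbox text into entries; each entry starts with its "From " line. */
    static List<String> split(String mboxText) {
        List<String> entries = new ArrayList<>();
        StringBuilder current = null;
        List<String> lines = mboxText.lines().collect(Collectors.toList());
        for (String line : lines) {
            if (line.startsWith("From ")) {
                // A new entry begins; finish the previous one, if any.
                if (current != null) entries.add(current.toString());
                current = new StringBuilder();
            }
            // Lines before the first "From " line are ignored.
            if (current != null) current.append(line).append('\n');
        }
        if (current != null) entries.add(current.toString());
        return entries;
    }

    /** Real mails only: drops the first entry, assumed to be the per-file pseudo mail. */
    static List<String> realMails(String mboxText) {
        List<String> entries = split(mboxText);
        return entries.isEmpty() ? entries : entries.subList(1, entries.size());
    }
}
```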
/Thomas