Ruby Library for dealing with mbox files


Booker Bense

I'm rewriting my email filtering hacks and I need a library that
can parse the Unix mbox file format when the embedded '\nFrom '
lines in a message are not quoted (i.e. one that honors the
Content-Length header).
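
To make the problem concrete, here is a minimal sketch of what Content-Length-aware splitting could look like. The helper name `split_mbox` is made up for illustration, and this is not the code from rubymail or formail. It assumes the Content-Length value is a correct byte count for the body, and falls back to scanning for the next "From " line when the header is absent.

```ruby
require 'stringio'

# Hypothetical helper: split an mbox stream into raw message strings.
# When a message has a Content-Length header, the body is read by byte
# count, so unquoted "From " lines inside it are not treated as separators.
def split_mbox(io)
  messages = []
  line = io.gets
  while line
    # Skip anything that is not an envelope "From " line.
    unless line.start_with?('From ')
      line = io.gets
      next
    end

    message = line.b          # binary copy, avoids encoding clashes
    content_length = nil

    # Read headers up to and including the blank line that ends them.
    while (line = io.gets)
      message << line.b
      break if line.chomp.empty?
      content_length = $1.to_i if line =~ /\AContent-Length:\s*(\d+)/i
    end

    if content_length
      # Trust the header (assumption: it is accurate).
      message << (io.read(content_length) || '').b
      line = io.gets
      line = io.gets while line && line.chomp.empty?
    else
      # No Content-Length: fall back to the next "From " line.
      while (line = io.gets) && !line.start_with?('From ')
        message << line.b
      end
    end

    messages << message
  end
  messages
end
```

Usage would be something like `File.open(path, 'rb') { |f| split_mbox(f) }`, where `path` is your mbox file. If a Content-Length value is wrong, this sketch will mis-split, so a real version would want to check that a "From " line actually follows the body.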

I've been using the rmail library in

rubymail-0.17

plus my own hack to parse the mbox file (rubymail's parser
crashes when given such a file). I think I have a fix for
this, but I thought I would check whether there are any more recent
projects that can do this.

I poked around RubyForge and found nothing useful. I am aware of
TMail, but I'm not sure it solves this problem either. Basically,
I need a Ruby version of formail.

If there is a "standard" email handling package, I'd take a look
and see about making it do what I want. At this point all the
email handling packages seem like abandonware...

_ Booker C. Bense
 

Garold L Johnson

Did you ever find a solution? I am trying to parse Thunderbird email
files, which are supposed to be standard mbox format, but RubyMail
crashes routinely.
Since the parser doesn't track line numbers, it is often difficult to
locate the problem.
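
As a rough diagnostic for tracking such problems down, something like the following sketch could list every line that begins with "From " along with its line number. The helper name `from_lines` is hypothetical and this is independent of RubyMail. It relies on the usual convention that a real separator starts the file or follows a blank line.

```ruby
require 'stringio'

# Hypothetical diagnostic: report each line starting with "From ",
# its 1-based line number, and whether the previous line was blank.
# Lines with after_blank == false are likely unquoted body lines.
def from_lines(io)
  report = []
  previous_blank = true # the start of the file counts as a boundary
  io.each_line.with_index(1) do |line, number|
    if line.start_with?('From ')
      report << { line: number, after_blank: previous_blank, text: line.chomp }
    end
    previous_blank = line.chomp.empty?
  end
  report
end
```

Running it as `File.open(path, 'rb') { |f| from_lines(f) }` (with `path` being the Thunderbird mbox file) gives a list to compare against wherever the parser gives up.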

Not all messages have a Content-Length header, but I haven't worked
out the pattern yet.

I don't want to start from scratch, as this format is clearly
non-trivial. I have looked at some Perl modules, and most of them appear
to have the same problem of not handling unquoted "From " lines.