Are there alternatives to Mail::Box and Email::Folder?

Discussion in 'Perl Misc' started by Trond Michelsen, Sep 7, 2004.

  1. Hi.

    I'm working on a mod_perl based webmail system, which is currently using
    MIME::Parser to decode the individual messages. This is working pretty well
    today, but we feel the need to have a more abstract, object-oriented
    interface to the mailfolders themselves. Well - any library would do, to be
    honest, the main point is to let support systems access the folders through
    the same library as the webmail does, instead of cut'n'pasting code all over
    the place. But, it would also be nice to be able to access other types of
    folders, including POP and IMAP accounts, so a common interface is clearly
    desirable.

    I've been looking at Mail::Box and Email::Folder, and they both look
    interesting. Unfortunately, they use Mail::Message and Email::Simple
    respectively for the individual messages. And since both of these modules
    prefer to have the entire mail in memory, they're pretty much useless in a
    mod_perl environment.

    So - are there any other interfaces like Email::Folder or Mail::Box that
    use less memory? I'm tempted to try to make something that borrows the
    interface from Email::Folder, but returns MIME::Entity objects, but if
    something else already exists, I'll have a look at that instead.
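
    A minimal sketch of that idea, assuming a plain Maildir layout. The
    package name My::LazyFolder and the tmp_dir option are hypothetical and
    made up for illustration; only the MIME::Parser calls are real. It keeps
    the next_message method name from Email::Folder, but hands back a
    MIME::Entity whose decoded bodies live on disk.

```perl
package My::LazyFolder;    # hypothetical name, not a CPAN module
use strict;
use warnings;
use MIME::Parser;

# Collect the message files of one Maildir and prepare a disk-backed parser.
sub new {
    my ($class, $maildir, %opt) = @_;
    my @files;
    for my $sub (qw(new cur)) {
        opendir(my $dh, "$maildir/$sub") or next;    # skip missing subdirs
        my @names = grep { !/^\./ } readdir $dh;     # drop ".", ".." and dotfiles
        closedir $dh;
        push @files, map { "$maildir/$sub/$_" } sort { $a cmp $b } @names;
    }
    my $parser = MIME::Parser->new;
    $parser->output_under($opt{tmp_dir} || "/tmp");  # one subdirectory per message
    $parser->output_to_core(0);                      # keep decoded bodies out of memory
    return bless { files => \@files, parser => $parser }, $class;
}

# Same method name as Email::Folder, but returns a MIME::Entity.
# Returns nothing once the folder is exhausted.
sub next_message {
    my ($self) = @_;
    my $file = shift @{ $self->{files} };
    return unless defined $file;
    return $self->{parser}->parse_open($file);
}

1;
```

    A caller would loop with "while (my $msg = $folder->next_message)" just
    as with Email::Folder. Each message is still parsed in full, so this
    only addresses memory, not parse time.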


    BTW: Here is a small test showing the memory usage of Mail::Box and
    Email::Folder. The folder contains a single message with a size of
    22486134 bytes.

    --8<--
    use GTop;
    use Mail::Box::Manager;
    my $start = GTop->new->proc_mem($$)->size;
    printf "Startup: %.1fMB\n", $start/1024/1024;
    my $f = Mail::Box::Manager->new->open(folder => "Maildir");
    foreach my $msg ($f->messages) {
        print "Subject: ", $msg->subject, "\n";
    }
    my $end = GTop->new->proc_mem($$)->size;
    printf "Usage: %.1fMB\n", ($end-$start)/1024/1024;
    __END__
    --8<--
    Startup: 5.0MB
    Subject: big mail
    Usage: 1.2MB

    Not that bad, but we're only accessing the header. Once we try to look at
    the body, memory usage goes up.

    --8<--
    use GTop;
    use Mail::Box::Manager;
    my $start = GTop->new->proc_mem($$)->size;
    printf "Startup: %.1fMB\n", $start/1024/1024;
    my $f = Mail::Box::Manager->new->open(folder => "Maildir");
    foreach my $msg ($f->messages) {
        print "Subject: ", $msg->subject, " (", scalar $msg->parts, " parts)\n";
    }
    my $end = GTop->new->proc_mem($$)->size;
    printf "Usage: %.1fMB\n", ($end-$start)/1024/1024;
    __END__
    --8<--
    Startup: 5.0MB
    Subject: big mail (5 parts)
    Usage: 205.3MB


    Finally, there's Email::Folder. It's a lot leaner than Mail::Box during
    startup, but it gets really fat once you access the message.

    --8<--
    use GTop;
    use Email::Folder;
    my $start = GTop->new->proc_mem($$)->size;
    printf "Startup: %.1fMB\n", $start/1024/1024;
    my $f = Email::Folder->new("Maildir");
    while (my $msg = $f->next_message) {
        print "Subject: ", $msg->header("subject"), "\n";
    }
    my $end = GTop->new->proc_mem($$)->size;
    printf "Usage: %.1fMB\n", ($end-$start)/1024/1024;
    __END__
    --8<--
    Startup: 2.0MB
    Subject: big mail
    Usage: 214.2MB


    Obviously, if there's something I've missed, and there are other ways of
    accessing the messages that don't require all this memory, or if my
    measurement of the usage is insanely wrong, then I'll be very happy to have
    it pointed out :)

    Oh - and just as a comparison to MIME::Parser, here's what it's like if I
    access that particular message through MIME::Parser:

    --8<--
    use GTop;
    use MIME::Parser;
    my $start = GTop->new->proc_mem($$)->size;
    printf "Startup: %.1fMB\n", $start/1024/1024;
    my $parser = MIME::Parser->new;
    $parser->output_dir("/tmp/");    # decoded parts are written here
    my $file = "Maildir/cur/1094480066.29295.localhost,S=22486134:2,";
    my $msg = $parser->parse_open($file);
    # head->get returns the field with its trailing newline
    print "Subject: ", $msg->head->get("subject"),
          " (", scalar $msg->parts, " parts)\n";
    my $end = GTop->new->proc_mem($$)->size;
    printf "Usage: %.1fMB\n", ($end-$start)/1024/1024;
    __END__
    --8<--
    Startup: 4.8MB
    Subject: big mail
    (5 parts)
    Usage: 0.5MB

    --
    Trond Michelsen
     
    Trond Michelsen, Sep 7, 2004
    #1

  2. Trond Michelsen wrote:
    > And since both of these modules prefer to have the entire mail in
    > memory, they're pretty much useless in a mod_perl environment.


    Even if I'm not able to advise you as regards the best modules, I
    couldn't help wondering what you mean by that. Why would mod_perl
    preclude modules that read messages into memory? A (normal) email
    message is hardly a huge amount of data. Isn't rather the opposite
    true, i.e. in order to be happy with mod_perl, you'd better not be
    short of memory?

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Sep 7, 2004
    #2

  3. "Gunnar Hjalmarsson" <> wrote in message
    news:...
    > Trond Michelsen wrote:
    >> And since both of these modules prefer to have the entire mail in
    >> memory, they're pretty much useless in a mod_perl environment.

    > Even if I'm not able to advise you as regards the best modules, I
    > couldn't help wondering what you mean by that. Why would mod_perl
    > preclude modules that read messages into memory?


    Because when you have 50 httpd-processes, it's unfortunate if they all use
    200MB of non-shared memory.

    > A (normal) email message is hardly a huge amount of data.


    But with mod_perl that's not really relevant. Once the process has been
    expanded to 200MB, this memory won't be made available for other processes
    (like the other httpd-children) until that httpd-child is terminated. If you
    have something like 100 simultaneous users, you are likely to hit a big
    message pretty often.

    Besides, when there's an attachment in the message, we're not going to show
    that inline, we'll provide a link to it. We never need to know what the data
    actually is, so there's absolutely no benefit in having it readily
    accessible in memory. And, if we don't write the decoded attachment to disk
    first, we'll have to re-read the message when the user wants to download the
    attachment. Since it's possible to tell MIME::Parser where to put the
    decoded parts of the message, we can leave the downloading to Apache.
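
    To illustrate, a rough sketch of that approach. The function name
    attachments_on_disk and its arguments are made up for this example;
    the MIME::Parser and MIME::Entity calls are the real ones. It parses
    one message with all decoded bodies written to a directory, and returns
    the MIME type and on-disk path of every non-multipart part, which is
    all a page needs in order to build download links.

```perl
use strict;
use warnings;
use MIME::Parser;

# Parse one message file, writing every decoded body to $dir instead of RAM.
# Returns a list of [mime_type, path_on_disk] pairs, one per part that has
# a body (multipart containers have none and are skipped).
sub attachments_on_disk {
    my ($file, $dir) = @_;
    my $parser = MIME::Parser->new;
    $parser->output_dir($dir);     # where the decoded parts are written
    $parser->output_to_core(0);    # never hold a decoded body in memory
    my $entity = $parser->parse_open($file);
    return map  { [ $_->effective_type, $_->bodyhandle->path ] }
           grep { defined $_->bodyhandle }
           $entity->parts_DFS;     # the entity itself plus all nested parts
}
```

    How the paths are mapped to URLs that Apache can serve is left out
    here, since that depends on the server configuration.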

    There's also the issue of performance. It takes MIME::Parser about 1.3
    seconds to parse the 20MB message. Mail::Box takes 8.5 seconds. I've only
    tested Email::Folder with Email::Simple, which takes 1.5 seconds, but that's
    without any decoding of attachments.

    > Isn't rather the opposite true, i.e. in order to be happy with mod_perl,
    > you'd better not be short of memory?


    Sure, but when one solution uses more than 400 times more memory than our
    current solution, it will create problems. Email::Folder is particularly
    problematic here, as it will read the entire message, even if you just want
    the headers. So if somebody has just a single 20MB mail in their mailbox,
    just listing the contents of that mailbox will require 200+MB.
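
    For comparison, listing a mailbox only needs the header block, which
    ends at the first blank line. A minimal core-Perl sketch, with the
    function name header_field invented for this example; it assumes one
    message per file (as in a Maildir), returns the first occurrence of the
    field with folded lines joined, and does no decoding of encoded words.

```perl
use strict;
use warnings;

# Return the unfolded value of the first header field called $name in $file,
# or undef if the header has no such field. Reading stops at the end of the
# header block at the latest, so the body is never loaded.
sub header_field {
    my ($file, $name) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    my $value;
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;
        last if $line eq '';                       # blank line ends the header
        if (defined $value) {
            last unless $line =~ /^[ \t]+(.*)/;    # next field starts: done
            $value .= " $1";                       # folded continuation line
        }
        elsif ($line =~ /^\Q$name\E:[ \t]*(.*)/i) {
            $value = $1;                           # field names are case-insensitive
        }
    }
    close $fh;
    return $value;
}
```

    Calling header_field($path, "Subject") for each file under Maildir/cur
    would then list the subjects with memory use that is independent of the
    message size.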

    BTW: The max message size on our system is 40MB, and many users do take
    advantage of that, so this isn't just a theoretical problem.

    --
    Trond Michelsen
     
    Trond Michelsen, Sep 7, 2004
    #3
